Preface


A series of technical reports on methodology in time series
analysis with applications is planned. The reports will be issued
sequentially, each covering an important aspect. The first is concerned
with Regression, Trends, Smoothing, and Differencing. Subsequent reports
will deal with forecasting, autoregressive integrated moving average
models and statistical techniques based on them, serial correlation,
and spectral analysis.

The  purpose  of  this  series  of  technical  reports  is  to  develop 
the  most  modern  procedures  of  time  series  analysis  and  forecasting  for 
use  in  engineering,  the  physical  sciences,  and  the  social  sciences. 

It  is  expected  that  these  techniques  will  be  useful  in  scientific  and 
managerial  activities  of  the  Department  of  Defense.  The  exposition  of 
methodology  is  based  on  a  succinct  presentation  of  the  theoretical 
background  and  is  illustrated  with  appropriate  examples  from  engineering, 
maintenance  and  reliability,  economics,  and  other  physical  and  social 
sciences;  these  will  give  a  "real-life"  flavor  to  the  presentation. 

Much of the material in these technical reports is based on
The Statistical Analysis of Time Series by T.W. Anderson, the preparation
of which was supported by the Office of Naval Research. A
preliminary version, written by N.D. Singpurwalla, was used in the
Stanford University course: Statistics 20T. Introduction to Time
Series Analysis. It is assumed that the reader has some background in
mathematics (calculus of several variables and introductory linear
algebra) and a basic statistics course.


Methods and Applications of Time Series Analysis

by

T.W. Anderson and N.D. Singpurwalla
Stanford  University  and  George  Washington  University 

Part  I:  Regression,  Trends,  Smoothing,  and  Differencing 

1.  Introduction 

1.1  What  Is  a  Time  Series? 

A time series is a sequence of observations which is ordered in
time (or some other entity of interest); the measurements may be
temperature, stress, Gross National Product, etc. The feature of a time series
analysis  that  distinguishes  it  from  other  statistical  analyses  is  the 
explicit  recognition  of  the  fact  that  the  observations  arrive  according 
to  some  order.  While  in  many  situations  the  observations  are  assumed 
to  be  statistically  independent,  in  time  series  analysis  the  possible 
dependence  between  the  observations  is  given  prime  consideration. 

1.2  Examples  of  Time  Series 

In many areas of daily life, there are phenomena whose evolution
and variation with the passing of time are of interest. Examples
of these are the weather, the price of gasoline, the level of industrial
output, the state of one's health, and the popularity of classical
music. The measurement of any particular characteristic over time
constitutes a realization of a time series. Often we are interested
in measuring several characteristics over time; for example, an
electrocardiogram consists of several records.


A very pragmatic objective for studying a time series may be to
predict the future based upon a knowledge of the past. Another objective
may be to obtain an understanding of the mechanism producing the series,
so that we may be able to control it and obtain desirable results. A
less pragmatic objective might be simply to obtain a succinct description
of the salient features of the series.

1.4 How Do We Take Measurements on a Time Series?

Even though many quantities, such as temperature and wind
velocity, change continuously in time and can sometimes be recorded
continuously in the form of a graph, very often in practice measurements
are made in discrete time. Digital computers, which are used for
the analysis of time series, accept data that are available at discrete
points in time. In view of the above, we shall confine ourselves to
time series that are recorded discretely in time at regular intervals,
such as at each hour on the hour or at the end of each year.

1.5 What Types of Time Series Do We Study Here?

We shall assume that the measurements that we make in a time
series are real numbers which are not limited to a finite
(or countable) number of values. That is, our measurements will consist
of "continuous" values. For example, the number of transistors that
we manufacture per day is assumed to be so large that there is no
sacrifice of reality if we consider it to be a continuous variable.
Furthermore, we shall only consider series which are stable;
that is, their values stay within certain bounds and change slowly
rather than abruptly, unlike, for example, the shock waves caused by an
underwater explosion.

1.6  A  General  Model 

Consider T equally spaced time points which we label as
1, 2, 3, ..., T; let us take observations on the time series at these time
points and denote these observations by $y_1, y_2, \ldots, y_T$. A simple-minded,
yet fairly general, model for the time series can be written as

$y_t = f(t) + u_t, \quad t = 1, 2, \ldots, T.$

That is, we say that the observed series is made up of a completely
deterministic (determined) part f(t), where f(t) is some function
of time t, and a random or stochastic part $u_t$, where $u_t$ obeys some
probabilistic law. In electrical engineering f(t) is often referred
to as the signal, and $u_t$ as the noise. The quantities f(t) and $u_t$
are not observable by us; they are theoretical quantities which
represent our abstraction of the series $y_t$. For example, if the $y_t$
denote measurements on the daily rainfall, then f(t) may represent
the long-run average rainfall at day t taken over many years, and the
$u_t$ would represent the daily irregularities which describe the
fluctuations from the norm. The random part $u_t$ has the usual "frequency"
interpretation that we use in statistics. That is, if in theory we
could repeat the entire situation under which the observations
$y_1, y_2, \ldots, y_T$ were obtained (in reality time proceeds progressively
in one direction), then the f(t) would be the same as before, but
the random terms $u_t$ would be different. That is, the various values
of $u_t$ at time t would be described by a frequency function. Any
errors of measurement (which are usually prevalent in many physical
situations) will be included in the $u_t$.

Our purpose in specifying the model $y_t = f(t) + u_t$ is to
represent the mechanism generating the observed series $y_t$ in a simple
though reasonable manner. All the same, we should always be cognizant
of the fact that the model is only an approximation to reality.
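
The frequency interpretation of the model can be illustrated numerically. The following is a minimal sketch, not part of the original report: the straight-line form of f(t), the normal disturbances, and the names `f`, `realization`, and `n_rep` are all assumptions chosen for illustration. It repeats the "entire situation" many times and shows that averaging at each fixed t approximately recovers f(t), since f(t) is the same in every repetition while the $u_t$ differ.

```python
import random

def f(t):
    # Hypothetical deterministic part: a straight-line trend (an assumption).
    return 10.0 + 0.5 * t

def realization(T, sigma, rng):
    # One realization y_1, ..., y_T of y_t = f(t) + u_t, with independent
    # normal disturbances u_t of mean 0 and standard deviation sigma.
    return [f(t) + rng.gauss(0.0, sigma) for t in range(1, T + 1)]

rng = random.Random(0)
T, sigma, n_rep = 5, 1.0, 20000
reps = [realization(T, sigma, rng) for _ in range(n_rep)]

# Average over the repetitions at each fixed time point t.
means = [sum(r[i] for r in reps) / n_rep for i in range(T)]
for i in range(T):
    print(i + 1, f(i + 1), round(means[i], 3))
```

In practice only one realization is available, which is why the structure of f(t) and of the $u_t$ has to be assumed rather than observed.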

1.7  The  Regression  Function 

The early development of time series analysis goes back to the
days of Gauss, who developed the method of least squares for the analysis
of problems in astronomy. In such models, the effect of time was
incorporated only in the systematic part f(t), and not in the random
part $u_t$. Thus it was assumed that the $u_t$ have expectation 0, that
the variance of $u_t$ was a constant over t, and that the $u_t$ were
uncorrelated at different points in time. The systematic part f(t)
was a known function of time, but often involved unknown coefficients.
For example, f(t) = A + Bt, where A and B are unknown constants;
f(t) is also known as the regression function.

A further analysis of f(t) involves a recognition of the fact
that there may be two types of sequences in time. One is a slowly
moving function of time, often referred to as a trend, and is exemplified
by a polynomial in t of a fairly low degree. The other is a cyclical
function of time which is exemplified by a finite sum of pairs of sine
and cosine terms; this latter is called a Fourier series. For example,
we could write $f(t) = \alpha \cos \lambda t + \beta \sin \lambda t$, $0 < \lambda < \pi$, where $\alpha$, $\beta$, and
$\lambda$ are constants. We notice that the function f(t) repeats itself
after t has gone $2\pi/\lambda$ units of time; $2\pi/\lambda$ is therefore called the
period. The reciprocal of the period, $\lambda/2\pi$, is known as the frequency.
The quantity $\rho = \sqrt{\alpha^2 + \beta^2}$ is called the amplitude.
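
As a quick numerical check of these definitions, here is a small sketch under assumed values: the constants `alpha`, `beta`, and `lam` below are hypothetical, not taken from the report. It verifies that f(t) repeats after one period and is bounded by the amplitude; the phase form $\rho \cos(\lambda t - \theta)$ used at the end is a standard trigonometric identity rather than something stated in the text.

```python
import math

# Hypothetical constants with 0 < lam < pi (assumptions for illustration).
alpha, beta, lam = 3.0, 4.0, math.pi / 6

def f(t):
    return alpha * math.cos(lam * t) + beta * math.sin(lam * t)

period = 2 * math.pi / lam         # 12 time units for this choice of lam
frequency = lam / (2 * math.pi)    # reciprocal of the period
rho = math.hypot(alpha, beta)      # amplitude sqrt(alpha^2 + beta^2)

ts = [0.1 * k for k in range(200)]
repeats = all(abs(f(t + period) - f(t)) < 1e-9 for t in ts)
max_abs = max(abs(f(t)) for t in ts)

# Equivalent single-cosine form: f(t) = rho * cos(lam * t - theta).
theta = math.atan2(beta, alpha)
same_form = all(abs(f(t) - rho * math.cos(lam * t - theta)) < 1e-9 for t in ts)
print(period, frequency, rho, repeats, same_form)
```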

Regression analysis and the theory of least squares deal with
methods of inference for the unknown coefficients in the regression
function f(t).

Suppose now that our model for the observations $y_1, y_2, \ldots$ is
$y_t = f(t) + u_t$, where f(t) is a known function of time t, and the
disturbances $u_t$ are distributed normally and independently with means
0 and variances 1. Given this information, we point out the fact that
a knowledge of $y_1, y_2, \ldots, y_{t-1}$ does not give us any help in predicting
$y_t$; the function f(s), s > t - 1, does not depend on $y_1, y_2, \ldots, y_{t-1}$.
However, if f(t) has unknown coefficients, then the $y_1, y_2, \ldots, y_{t-1}$
can be used for estimating the unknown coefficients in f(t).

1.8  Stationary  Stochastic  Processes 

A general model in which the effect of time is represented in
the random part $u_t$ is a stationary stochastic process. For purposes
of illustration, we shall consider what is known as an autoregressive
process.

Suppose that $y_1$ has some distribution, say normal with mean 0;
let the joint distribution of $y_1$ and $y_2$ be the same as the joint
distribution of $y_1$ and $\rho y_1 + u_2$, where $\rho$ is some constant, and
$u_2$ has a distribution which is independent of $y_1$ with mean 0.
We shall write $y_2 = \rho y_1 + u_2$. In general, the joint distribution of
$y_1, y_2, \ldots, y_{t-1}, y_t$ is the same as the joint distribution of
$y_1, y_2, \ldots, y_{t-1}, \rho y_{t-1} + u_t$, where $u_t$ is distributed independently of
$y_1, y_2, \ldots, y_{t-1}$ and has mean 0. If the marginal distributions of
$u_2, u_3, \ldots$ are identical (and the distribution of $y_1$ is specified
appropriately), then $(y_1, y_2, \ldots, y_t)$ represents a segment of a stationary
stochastic process, known as an autoregressive process, and

$y_t = \rho y_{t-1} + u_t$

is known as a stochastic difference equation of the first order.

The important notion conveyed by the above construction is that
the disturbance term $u_t$ has an effect not only on $y_t$, but also on
the subsequent y's, that is, $y_{t+1}, y_{t+2}, \ldots$. Note that the conditional
expectation of $y_t$, given $y_{t-1}, y_{t-2}, \ldots, y_1$, is

$\mathcal{E}(y_t \mid y_{t-1}, y_{t-2}, \ldots, y_1) = \rho y_{t-1}.$

Given $y_{t-1}, y_{t-2}, \ldots, y_1$, our "best" prediction of $y_t$, in the sense of
minimizing the mean square error, is $\rho y_{t-1}$. We observe here
that for this model a knowledge of the earlier observations assists
us in predicting $y_t$.
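
The contrast with the regression model of Section 1.7 can be seen in a small simulation. This is only a sketch under stated assumptions: the value of `rho`, the unit-variance normal disturbances, and the choice of a $N(0, 1/(1-\rho^2))$ distribution for $y_1$ (one way of specifying it "appropriately" in that case) are all illustrative choices, not taken from the report. It compares the mean square error of the predictor $\rho y_{t-1}$ with that of a predictor that ignores the past.

```python
import random

rho = 0.8                 # hypothetical value with |rho| < 1
rng = random.Random(1)
T = 5000

# Start y_1 from N(0, 1 / (1 - rho^2)), the stationary distribution
# when the disturbances u_t are N(0, 1) (an assumption of this sketch).
y = [rng.gauss(0.0, (1.0 / (1.0 - rho ** 2)) ** 0.5)]
for _ in range(T - 1):
    y.append(rho * y[-1] + rng.gauss(0.0, 1.0))

# Mean square error of the predictor rho * y_{t-1}, and of the predictor 0
# (the unconditional mean), both over t = 2, ..., T.
mse_ar = sum((y[t] - rho * y[t - 1]) ** 2 for t in range(1, T)) / (T - 1)
mse_zero = sum(y[t] ** 2 for t in range(1, T)) / (T - 1)
print(round(mse_ar, 3), round(mse_zero, 3))
```

The first figure should be close to the disturbance variance, and the second close to the larger variance of $y_t$ itself, which is the sense in which earlier observations help.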
1.9 More General Models

There are many situations, particularly those involving economic
time series, wherein it may be advantageous to incorporate the effect
of time in both the systematic part f(t) and the random part $u_t$.
For example, if a series consists of a long-run movement and a seasonal
variation, then f(t) could be chosen to represent these features,
and a $u_t$ which represents other irregularities could be chosen to be
an autoregressive process.

Given a model for a time series in which the effects of time
are incorporated in either the systematic part, or the random part,
or both, our objectives are to estimate the unknown coefficients, test
hypotheses about these coefficients, decide on the appropriate order
of the process to be used, and predict the future values of the
process. In this, the first, technical report we treat statistical
procedures which are concerned primarily with the systematic part
f(t); the random part $u_t$ does not show the effect of time. These
procedures are useful in statistical analysis, and many of them are
used in subsequent approaches.




2. The Use of Regression Analysis in Time Series Analysis

Regression analysis, or what is also known as the classical
least squares theory, provides us with many of the techniques that are
predominantly used in time series analysis. Thus it is important for
us to present a brief review of the main results. The independent
variables are specified functions of time, such as powers of t or
trigonometric functions of t. As stated before, the random terms
$u_t$ may or may not be correlated with each other, and may or may not
be normally distributed.

2.1 An Outline of the General Theory of Least Squares

Let $y_1, y_2, \ldots, y_T$ denote T observations on a time series;
assume that these y's are uncorrelated and have means

(2.1)  $\mathcal{E} y_t = \sum_{i=1}^{p} \beta_i z_{it}, \quad t = 1, 2, \ldots, T,$

and variances

(2.2)  $\mathcal{E}(y_t - \mathcal{E} y_t)^2 = \sigma^2, \quad t = 1, 2, \ldots, T,$

where the $z_{it}$'s are given functions of t and are called the independent
variables. The $y_t$'s are called the dependent variables, and the $\beta_i$'s,
i = 1, 2, ..., p, are p unknown coefficients; $\mathcal{E} y_t$ denotes the expected
value of $y_t$.

If $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ denotes the (column) vector of the $\beta_i$'s,
and if $z_t = (z_{1t}, z_{2t}, \ldots, z_{pt})'$ denotes the (column) vector of the
$z_{it}$'s, t = 1, 2, ..., T, then equation (2.1) can be written in compact form
as

$\mathcal{E} y_t = \beta' z_t,$

where $\beta'$ denotes the transpose of the vector $\beta$.

Suppose that $b = (b_1, b_2, \ldots, b_p)'$ is any estimator of the
vector $\beta$. Then given the values $y_1, y_2, \ldots, y_T$, our objective in least
squares analysis is to find that vector b in the class of all vectors
b, such that $\sum_{t=1}^{T} (y_t - b' z_t)^2$ is minimized. If we set $A = \sum_{t=1}^{T} z_t z_t'$
and $c = \sum_{t=1}^{T} y_t z_t$, then it turns out that $b = A^{-1} c$, where $A^{-1}$ denotes
the inverse of A, is the least squares estimate of $\beta$. We, of course,
assume that A is nonsingular, and thus $T \ge p$.

We can verify that b is an unbiased estimator of $\beta$, and the
covariance matrix of b is

$\mathcal{E}(b - \beta)(b - \beta)' = \sigma^2 A^{-1},$

where $\sigma^2$ is the variance of the disturbance terms $u_t$. If T > p,
then an unbiased estimate of $\sigma^2$ is

$s^2 = \frac{\sum_{t=1}^{T} (y_t - b' z_t)^2}{T - p} = \frac{\sum_{t=1}^{T} y_t^2 - b' A b}{T - p}.$
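
The computation of $b = A^{-1} c$ and $s^2$ can be sketched in a few lines. The following is a minimal illustration, not the authors' program: the helper names `solve` and `least_squares` and the small data set are hypothetical, and the linear system is solved by elementary Gauss-Jordan elimination so that no external library is assumed.

```python
def solve(A, c):
    # Solve A x = c by Gauss-Jordan elimination with partial pivoting.
    n = len(A)
    M = [list(A[i]) + [c[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                m = M[r][col] / M[col][col]
                M[r] = [a - m * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def least_squares(z, y):
    # z[t] is the vector z_t of independent variables; y[t] is y_t.
    T, p = len(z), len(z[0])
    A = [[sum(z[t][i] * z[t][j] for t in range(T)) for j in range(p)]
         for i in range(p)]                          # A = sum of z_t z_t'
    c = [sum(y[t] * z[t][i] for t in range(T)) for i in range(p)]  # c = sum of y_t z_t
    b = solve(A, c)                                  # b = A^{-1} c
    rss = sum((y[t] - sum(b[i] * z[t][i] for i in range(p))) ** 2
              for t in range(T))
    s2 = rss / (T - p)                               # requires T > p
    return b, s2

# Hypothetical data for the straight-line regression f(t) = beta_1 + beta_2 t.
y = [1.1, 1.9, 3.2, 3.8, 5.1]
z = [[1.0, float(t)] for t in range(1, 6)]
b, s2 = least_squares(z, y)
print(b, s2)
```

For larger problems one would normally rely on a numerical linear algebra library instead of forming and solving the normal equations by hand.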


The above results are not based on any assumptions regarding
the distribution of the $u_t$'s (the disturbances). However, if we assume
that the $y_t$'s are independently and normally distributed, then b is
also the maximum likelihood estimate of $\beta$, and $s^2 (T - p)/T$ is the
maximum likelihood estimate of $\sigma^2$. Furthermore, we can also show that
the vector b has a multivariate normal distribution with mean vector
$\beta$ and covariance matrix $\sigma^2 A^{-1}$; we shall denote this distributional
result by writing $b \sim N(\beta, \sigma^2 A^{-1})$. Also, the quantity $(T - p) s^2/\sigma^2$
is independent of b, and has a chi-square distribution with T - p
degrees of freedom, denoted by $\chi^2(T - p)$.

The main advantage of assuming that the $y_t$ have a normal
distribution is that we can test hypotheses about the $\beta_i$'s and also
obtain confidence regions for them. For details about these, we
refer the reader to T.W. Anderson (1971, pp. 10-11).

2.1.1  The  Case  of  Correlated  Dependent  Variables 
The results given above apply when the observations
$y_1, y_2, \ldots, y_T$ are not correlated, that is, when

$\mathcal{E}(y_t - \beta' z_t)(y_s - \beta' z_s) = 0, \quad t \neq s.$

Suppose now that the dependent variables are correlated, and
that the covariance matrix is known to within a constant. That is,
suppose that

(2.3)  $\mathcal{E} y_t = \beta' z_t, \quad t = 1, 2, \ldots, T,$

(2.4)  $\mathcal{E}(y_t - \beta' z_t)(y_s - \beta' z_s) = \sigma^2 \psi_{ts}, \quad t, s = 1, 2, \ldots, T,$

where the $\psi_{ts}$ are known.

We shall find it convenient to have a more complete matrix
notation. Let $y = (y_1, y_2, \ldots, y_T)'$, $Z = (z_1, z_2, \ldots, z_T)'$, and $\Psi = [\psi_{ts}]$
denote the column vector of the $y_t$'s, the matrix of the $z_t$'s, and the
matrix of the $\psi_{ts}$, respectively. The matrix Z is known as the design
matrix. In this notation, (2.3) and (2.4) read

$\mathcal{E} y = Z\beta,$

$\mathcal{E}(y - Z\beta)(y - Z\beta)' = \sigma^2 \Psi.$


The least squares estimator of $\beta$ is given by

$b = (Z' \Psi^{-1} Z)^{-1} Z' \Psi^{-1} y,$

with $\mathcal{E} b = \beta$; furthermore,

$\mathcal{E}(b - \beta)(b - \beta)' = \sigma^2 (Z' \Psi^{-1} Z)^{-1}.$

An unbiased estimator of $\sigma^2$ is now $s^2$, where

(2.5)  $s^2 = (y - Zb)' \Psi^{-1} (y - Zb)/(T - p).$

When the observations $y_t$ are uncorrelated and have a constant
unknown variance $\sigma^2$, $\Psi$ will be a matrix with all its off-diagonal
terms equal to zero and the diagonal terms equal to 1. When this
happens, the expressions given above take simpler forms and agree
with the corresponding expressions of Section 2.1.
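
The estimator for correlated observations can be sketched as follows. This is an illustrative sketch only: the helper names (`solve`, `matmul`, `transpose`, `gls`), the data, and the particular choice $\psi_{ts} = r^{|t-s|}$ are assumptions made here, not taken from the report. Passing the identity matrix for $\Psi$ reproduces the estimator of Section 2.1.

```python
def solve(A, B):
    # Solve A X = B by Gauss-Jordan elimination with partial pivoting;
    # A is n x n and B is n x k, both given as lists of rows.
    n, k = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                m = M[r][col] / M[col][col]
                M[r] = [a - m * b for a, b in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(k)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def gls(Z, y, Psi):
    # b = (Z' Psi^{-1} Z)^{-1} Z' Psi^{-1} y, and s^2 as in (2.5).
    T, p = len(Z), len(Z[0])
    Zt = transpose(Z)
    PiZ = solve(Psi, Z)                      # Psi^{-1} Z
    Piy = solve(Psi, [[v] for v in y])       # Psi^{-1} y
    b = [row[0] for row in solve(matmul(Zt, PiZ), matmul(Zt, Piy))]
    resid = [[y[t] - sum(Z[t][i] * b[i] for i in range(p))] for t in range(T)]
    s2 = matmul(transpose(resid), solve(Psi, resid))[0][0] / (T - p)
    return b, s2

# Hypothetical example with psi_ts = r^{|t-s|} (an assumed correlation pattern).
T, r = 6, 0.5
Psi = [[r ** abs(t - s) for s in range(T)] for t in range(T)]
identity = [[1.0 if t == s else 0.0 for s in range(T)] for t in range(T)]
Z = [[1.0, float(t)] for t in range(1, T + 1)]
y = [2.0, 2.9, 4.2, 4.8, 6.1, 7.0]

b_gls, s2_gls = gls(Z, y, Psi)
b_ols, s2_ols = gls(Z, y, identity)          # reduces to Section 2.1
print(b_gls, s2_gls)
print(b_ols, s2_ols)
```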


2.2  Prediction 

Suppose that we wish to predict $y_\tau$, a future observation at
time $\tau$, where $\tau > T$. If we know $\beta$, then we know the regression
function and so $\mathcal{E} y_\tau = \beta' z_\tau$ is our best predictor of $y_\tau$; note that
$z_\tau = (z_{1\tau}, z_{2\tau}, \ldots, z_{p\tau})'$ is assumed known. In practice, $\beta$ will not be known
and so we will have to use the observations $y_1, y_2, \ldots, y_T$ to estimate
$\beta$ and then predict $y_\tau$ using the estimated value of $\beta$.

We can show [Anderson (1971), p. 20] that the best (in the
sense of being unbiased and of minimizing the variance) estimator of
$\mathcal{E} y_\tau$ is $b' z_\tau$, where b is the least squares estimator of $\beta$.

Furthermore, the variance of our predictor $b' z_\tau$ is

$\mathcal{E}(b' z_\tau - \beta' z_\tau)^2 = \sigma^2 z_\tau' (Z' \Psi^{-1} Z)^{-1} z_\tau.$

When the observations $y_t$ are uncorrelated and have a constant
unknown variance $\sigma^2$, the mean square error of prediction becomes
$\sigma^2 (1 + z_\tau' A^{-1} z_\tau)$, where $A = Z'Z$.

If we assume that the $y_t$'s are distributed normally, then a
prediction interval for $y_\tau$ can be obtained using the fact [cf. Anderson
(1971), p. 21] that

$\frac{y_\tau - b' z_\tau}{s \sqrt{1 + z_\tau' (Z' \Psi^{-1} Z)^{-1} z_\tau}}$

has a Student t-distribution with T - p degrees of freedom. Thus
a prediction interval for $y_\tau$ with confidence $1 - \varepsilon$ is given by

(2.6)  $b' z_\tau \pm t_{T-p}(\varepsilon)\, s \sqrt{z_\tau' (Z' \Psi^{-1} Z)^{-1} z_\tau + 1},$

where s is obtained via equation (2.5) and $t_{T-p}(\varepsilon)$ is the number
such that the probability of a Student t-distribution with T - p
degrees of freedom between $\pm t_{T-p}(\varepsilon)$ is $1 - \varepsilon$.

When  we  cannot  assume  normality  of  the  observations,  the  above 
procedures  can  still  be  justified  for  large  samples  on  the  basis  of 
asymptotic  theory  (Anderson  (1971),  p.  23). 

2.3  An  Illustrative  Example 

We shall illustrate our use of the methods discussed thus far by
considering the following example involving some real-life data. We
would like to emphasize that our main objective in presenting this
example is to illustrate the mechanics of using the methodology. Our
goal is not to solve realistically the practical problems posed by
the example.

In Table 2.1, we show some data (taken from Beals, 1972, p. 297)
on the demand for money, $y_t$, for year t, in millions of Jamaican dollars,
for the years 1961 through 1970. Also shown are the data on the gross
national product $z_{1t}$, and the treasury bill rate $z_{2t}$.

Table 2.1

Money Stock, GNP, and Treasury Bill Rate
for Jamaica, 1961-1970

        Money Stock           GNP                   Treasury Bill
Year    (millions of          (millions of          Rate
        Jamaican dollars)     Jamaican dollars)     (%)

1961         57.2                  485.5                4.01
1962         55.6                  506.9                4.93
1963         54.4                  540.2                4.18
1964         62.6                  589.0                3.27
1965         67.2                  637.4                4.35
1966         66.4                  692.7                4.44
1967         70.4                  745.3                4.40
1968         82.8                  821.6                4.71
1969        102.4                  906.3                3.33
1970        115.0                  981.3                3.90

For the purposes of this example, following Beals, we shall assume that the
demand for money (money stock) is a linear function of the gross
national product and the treasury bill rate. That is,

$y_t = \beta_0 z_{0t} + \beta_1 z_{1t} + \beta_2 z_{2t} + u_t, \quad t = 1961, \ldots, 1970,$

where $z_{0t} = 1$ for all values of t, and $\beta_0$, $\beta_1$, and $\beta_2$ are unknown
constants.

We shall treat the random disturbance terms as being normally
distributed with mean zero and an unknown constant variance $\sigma^2$.

A  graph  of  the  data  of  Table  2.1  is  shown  in  Figure  2.1.  An 
inspection  of  Figure  2.1  reveals  the  fact  that  whereas  the  money  stock 
and  the  gross  national  product  increase  with  time,  the  treasury  bill 
rates  do  not  appear  to  reflect  an  upward  movement. 

In vector notation, $\beta = (\beta_0, \beta_1, \beta_2)'$ and $z_t = (z_{0t}, z_{1t}, z_{2t})'$;
thus, our model for the money stock series is $\mathcal{E} y_t = \beta' z_t$, with
$\mathcal{E} y_t = \sum_{i=0}^{2} \beta_i z_{it}$ and $\mathcal{E}(y_t - \mathcal{E} y_t)^2 = \sigma^2$. Our data on the money stock
is $y = (57.2, 55.6, \ldots, 115.0)'$, and our design matrix is
$Z = (z_{1961}, \ldots, z_{1970})'$, where, for example, $z_{1961} = (1, 485.5, 4.01)'$.
The least squares estimate of $\beta$ is $b = (b_0, b_1, b_2)' = (Z'Z)^{-1} Z'y$
$= (17.059, .1114, -4.961)'$. An unbiased estimate of $\sigma^2$ is $s^2$,
where $s^2 = (y'y - b'Ab)/(T - p) = 263.3714/7 = 37.624$, where $A = Z'Z$.

In order to be able to test the significance of the estimated coefficients
b, we need to first compute their standard errors, which can be obtained
from $s^2 A^{-1}$. These turn out to be 6.134, .0123, and 3.893, respectively.


[Figure 2.1. Money stock, gross national product, and treasury bill rates for Jamaica, each plotted against time.]


To test for the significance of these coefficients, we compute their
t-statistic values, 17.058/6.134 = 2.7809, .1114/.0123 = 9.057,
and -4.96/3.893 = -1.27. Since the length of the series is small,
the significance level should be large, say 10% or so. Economic theory
says that both the GNP and the interest rates affect money supply,
thus a test of significance of the coefficients $\beta_1$ and $\beta_2$ should
be one sided. The fact that the estimate of $\beta_2$ is negative makes
economic sense, since an increase in the treasury bill rate will tend
to lower the money stock. The critical value of the t-statistic with
7 degrees of freedom, for a one-sided test at the 10% level of significance,
is 1.415. Thus the coefficients $\beta_0$ and $\beta_1$ are clearly significant,
whereas the coefficient $\beta_2$ is nearly significant. In view of the
above, together with the fact that interest rates are known to affect
money supply, the coefficient $\beta_2$ is retained in the model. A summary
of the pertinent test statistics is given in Table 2.2.

Table 2.2

Summary Statistics for Regression Analysis of Data on
Money Stock, GNP and Treasury Bill Rates for Jamaica

Variable              Coefficient    Standard Error    t-statistic

Constant               17.0585          6.134             2.781
GNP                      .1114           .0123            9.057
Treasury Bill Rate     -4.9611          3.893            -1.27

$\bar{y} = \sum_{t=1}^{10} y_t/10 = 73.400$;  $\sum_{t=1}^{10} (y_t - \bar{y})^2 = 3813.28$;
$\bar{z}_1 = \sum_{t=1}^{10} z_{1t}/10 = 690.619$;  $\bar{z}_2 = \sum_{t=1}^{10} z_{2t}/10 = 4.152$;
$R^2 = .931$.

To continue with our analysis, we compute the "fitted values"
$\hat{y}_t$, where

$\hat{y}_t = 17.059 + .1114\, z_{1t} - 4.961\, z_{2t}, \quad t = 1961, \ldots, 1970.$

These values are shown in column 3 of Table 2.3. The "residuals"
$y_t - \hat{y}_t$ are shown in column 4 of Table 2.3; they are plotted in
Figure 2.2a. In Figure 2.2b we plot $y_t$ versus $\hat{y}_t$, and note that
the two compare well.

Table 2.3

A Comparison of Actual Versus Fitted Values of the Money Stock
for Jamaica, 1961-1970

        Actual Values of     Fitted Values     Residual
Year    Money Stock $y_t$    $\hat{y}_t$       $y_t - \hat{y}_t$

1961         57.2               51.25             5.95
1962         55.6               49.07             6.53
1963         54.4               56.50            -2.1
1964         62.6               66.45            -3.85
1965         67.2               66.48              .72
1966         66.4               72.20            -5.8
1967         70.4               78.26            -7.86
1968         82.8               85.22            -2.42
1969        102.4              101.50              .9
1970        115.0              107.03             7.97

[Figure 2.2. Panel title recoverable from the original: fitted values v. residuals.]

The residual sum of squares is $\sum_{t=1}^{10} (y_t - \hat{y}_t)^2 = 263.371$;
this quantity, when subtracted from the original sum of squares
$\sum_{t=1}^{10} (y_t - \bar{y})^2 = 3813.28$, gives us 3549.909, which is the sum of
squares due to the regression of $y_t$ on $z_{1t}$ and $z_{2t}$. The ratio
3549.909/3813.28 = .9309 is the $R^2$ value, and this is a measure of
how well the chosen model explains the variability of $y_t$.
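
The arithmetic of this example can be retraced from the data of Table 2.1. The sketch below is an illustration under stated assumptions, not the authors' original computation: the helper `solve` and all variable names are hypothetical, and the normal equations are solved by plain Gauss-Jordan elimination.

```python
def solve(A, B):
    # Solve A X = B by Gauss-Jordan elimination with partial pivoting;
    # A is n x n and B is n x k, both given as lists of rows.
    n, k = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                m = M[r][col] / M[col][col]
                M[r] = [a - m * b for a, b in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(k)] for i in range(n)]

# Data of Table 2.1 (1961-1970).
y = [57.2, 55.6, 54.4, 62.6, 67.2, 66.4, 70.4, 82.8, 102.4, 115.0]
gnp = [485.5, 506.9, 540.2, 589.0, 637.4, 692.7, 745.3, 821.6, 906.3, 981.3]
bill = [4.01, 4.93, 4.18, 3.27, 4.35, 4.44, 4.40, 4.71, 3.33, 3.90]

Z = [[1.0, g, r] for g, r in zip(gnp, bill)]       # rows are z_t'
T, p = len(y), 3

A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]  # Z'Z
c = [[sum(z[i] * yt for z, yt in zip(Z, y))] for i in range(p)]          # Z'y
b = [row[0] for row in solve(A, c)]
A_inv = solve(A, [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)])

fitted = [sum(bi * zi for bi, zi in zip(b, z)) for z in Z]
rss = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
s2 = rss / (T - p)
se = [(s2 * A_inv[i][i]) ** 0.5 for i in range(p)]  # from the diagonal of s^2 A^{-1}
t_stats = [b[i] / se[i] for i in range(p)]

ybar = sum(y) / T
tss = sum((yt - ybar) ** 2 for yt in y)
r2 = (tss - rss) / tss

print("b       =", [round(v, 4) for v in b])
print("s^2     =", round(s2, 3))
print("se      =", [round(v, 4) for v in se])
print("t       =", [round(v, 3) for v in t_stats])
print("R^2     =", round(r2, 4))
```

The coefficients, residual sum of squares, and $R^2$ obtained this way can be compared directly with Tables 2.2 and 2.3. One point worth checking when doing so: the tabulated standard error of the constant, 6.134, coincides with $s = \sqrt{37.624}$, so the value computed for the constant from the diagonal of $s^2 A^{-1}$ need not agree with it.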

Conclusion 

The purpose of this analysis is an explanation of the behavior
of the money supply, rather than its prediction, since in practice
we do not know in advance the bill rates. Our conclusion is that the
GNP and the bill rates do affect the money stock, in a manner indicated
by the model.
3. Trends and Smoothing

A  trend  is  a  broad  movement,  either  upwards  or  downwards,  in  a 
time  series.  Trends  are  generally  of  great  interest  to  businessmen  and 
economists.  Given  a  time  series,  we  may  wish  to  infer  the  presence  or 
the  absence  of  a  trend.  Often,  we  may  know  of  a  physical  or  an  economic 
principle  which  guides  us  in  the  specification  of  the  functional  form 
of  the  trend.  For  example,  in  reliability  theory  it  is  common  to  assume 
that  the  failure  rate  of  items  which  age  with  time  increases  as  either 
a  linear  or  a  quadratic  function  of  time.  [See  Mann,  Schafer, 
Singpurwalla  (1975).]  Thus,  the  estimated  failure  rate  which  constitutes 
a  natural  time  series  [Singpurwalla  (1975)]  should  contain  a  term  which 
accounts  for  this  trend.  However,  there  are  many  real  life  situations 
wherein  we  do  not  have  an  underlying  reason  for  specifying  the  general 
form  of  the  trend,  and  so  we  may  wish  to  approximate  it  by  a  polynomial 
of  a  very  low  degree. 

When  we  use  such  polynomial  functions  to  describe  the  trend,  we 
should  bear  in  mind  that  these  functions  are  approximations  to  an  unknown 
function  of  time.  The  true  unknown  function  may  be  much  more  complicated 
than  the  approximating  polynomial.  Consequently,  we  cannot  give  any 
real  physical  meaning  to  the  coefficients  of  the  polynomial.  Furthermore, 
the  polynomial  can  be  used  for  interpolation  only;  extrapolations  must 
be  made  with  great  caution. 


Following Section 2, let us assume that an observation $y_t$,
t = 1, 2, ..., T, is the sum of f(t), which is a trend in t, and an error
(disturbance) term $u_t$, where $\mathcal{E} u_t = 0$, $\mathcal{E} u_t^2 = \sigma^2$, and $\mathcal{E} u_t u_s = 0$,
$t \neq s$. Suppose further that the trend is a polynomial in t of degree
p, where p < T, that is,

(3.1)  $y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \cdots + \beta_p t^p + u_t.$

In practice, we do not know the $\beta$'s nor do we know the value
of p, and our objective is to choose the smallest value of p which
is consistent with the observed $y_t$ and (3.1). A point of view that
we may take here is that in fitting a polynomial trend, or for that
matter any other regression function, our goal is a reduction of the
data. If we set $t^{i-1} = z_{it}$, i = 1, 2, ..., p + 1, then the conditions given
by (2.1) and (2.2) are satisfied by the above model, and for any specified
value of p, we can use the method of least squares for estimating
the $\beta$'s. However, a considerable amount of simplification, both
computational and statistical, can result if we use orthogonal polynomials.

3.1.1 Orthogonal Polynomials

We say that column i of a given matrix is orthogonal to column
j of the matrix if the sum of their corresponding cross products is
zero. Our aim is to have the columns of the design matrix Z orthogonal
to each other. This is achieved by transforming the independent variables
$1, t, t^2, \ldots, t^p$ (all of which are powers of t) to a set of what are
known as orthogonal independent variables $\phi_{0T}(t), \phi_{1T}(t), \ldots, \phi_{pT}(t)$,
where

$\sum_{t=1}^{T} \phi_{iT}(t)\, \phi_{kT}(t) = 0, \quad i \neq k, \quad i, k = 0, 1, \ldots, p.$


For purposes of illustration, we have $\phi_{0T}(t) = 1$, a polynomial
in t of order 0, $\phi_{1T}(t) = t - (1/2)(T + 1)$, a polynomial in t of
order 1, $\phi_{2T}(t) = t^2 - (T + 1)t + (T + 1)(T + 2)/6$, a polynomial in t
of order 2, and so on. The coefficient of $t^k$ in the orthogonal
polynomial of degree k is often taken to be 1.

Fisher and Yates (1963) give the values of orthogonal polynomials
up to degree 5 for $T$ up to 75.  Another source for obtaining these
polynomials is Biometrika Tables for Statisticians, Vol. 1.  (In these
tables the coefficient of $t^k$ is taken so that the values of the
orthogonal polynomials are integers; examples are given later.)
Of course, tables are not needed to carry out these procedures with a
computer, because programs to construct orthogonal polynomials are readily
available.
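
As a hedged illustration of what such a program might look like, the sketch below builds the orthogonal polynomials with leading coefficient 1 by Gram-Schmidt orthogonalization of the powers $1, t, \ldots, t^p$ over $t = 1, \ldots, T$. The function name `orthogonal_polynomials` and the use of exact fractions are choices made for this sketch; they are not taken from the report or from any particular library.

```python
from fractions import Fraction


def orthogonal_polynomials(T, p):
    """Values of monic orthogonal polynomials of degree 0..p at t = 1..T.

    Returns a list phi where phi[k][t - 1] is the value of the degree-k
    polynomial at time t.  (Hypothetical helper for illustration only.)
    """
    ts = [Fraction(t) for t in range(1, T + 1)]
    phi = []
    for k in range(p + 1):
        col = [t ** k for t in ts]  # start from the plain power t^k
        for prev in phi:  # subtract the projection on each lower degree
            num = sum(c * q for c, q in zip(col, prev))
            den = sum(q * q for q in prev)
            col = [c - (num / den) * q for c, q in zip(col, prev)]
        phi.append(col)
    return phi


phi = orthogonal_polynomials(T=5, p=2)
print([int(v) for v in phi[1]])  # [-2, -1, 0, 1, 2], i.e. t - (T + 1)/2
print([int(v) for v in phi[2]])  # [2, -1, -2, -1, 2]
```

Because the leading coefficient is kept at 1, the columns agree with the formulas for $\phi_{1T}(t)$ and $\phi_{2T}(t)$ given above; tabulated values that are scaled to integers differ from these only by a constant factor per column.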

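In terms of these orthogonal polynomials, (3.1) can be written as

$$y_t = \gamma_0 \phi_{0T}(t) + \gamma_1 \phi_{1T}(t) + \cdots + \gamma_p \phi_{pT}(t) + u_t ,$$

where $\gamma_0, \ldots, \gamma_p$ are a new set of constants which are linearly related
to the original set of constants $\beta_0, \ldots, \beta_p$; in fact, the last coefficient
$\gamma_p$ equals $\beta_p$.

For this model, the design matrix $Z$ is

$$Z = \begin{pmatrix}
\phi_{0T}(1) & \phi_{1T}(1) & \cdots & \phi_{pT}(1) \\
\phi_{0T}(2) & \phi_{1T}(2) & \cdots & \phi_{pT}(2) \\
\vdots & \vdots & & \vdots \\
\phi_{0T}(T) & \phi_{1T}(T) & \cdots & \phi_{pT}(T)
\end{pmatrix} .$$

If we write $a_{kk} = \sum_{t=1}^{T} (\phi_{kT}(t))^2$, then

$$Z'Z = \begin{pmatrix}
a_{00} & 0 & \cdots & 0 \\
0 & a_{11} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{pp}
\end{pmatrix} ,$$

the orthogonality property making the off-diagonal terms equal to zero.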

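If we set $\gamma = (\gamma_0, \gamma_1, \ldots, \gamma_p)'$, then the least squares estimator
of $\gamma$ is $\hat{\gamma} = (Z'Z)^{-1} Z'y$; thus an unbiased estimator of $\gamma_k$ is

$$\hat{\gamma}_k = \frac{\sum_{t=1}^{T} y_t\,\phi_{kT}(t)}{\sum_{t=1}^{T} \phi_{kT}^2(t)} , \qquad k = 0,1,\ldots,p .$$

The covariance matrix of $\hat{\gamma}$ is $(Z'Z)^{-1}\sigma^2$, from which we see that

$$\mathrm{Var}(\hat{\gamma}_k) = \frac{\sigma^2}{a_{kk}} , \qquad k = 0,1,\ldots,p ;$$

an estimator of $\sigma^2$ is

$$s^2 = \frac{\sum_{t=1}^{T} \left( y_t - \sum_{k=0}^{p} \hat{\gamma}_k \phi_{kT}(t) \right)^2}{T - p - 1} .$$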


We remark that the particularly simple forms of our estimators
are due to our orthogonalization procedure.  Another advantage
of orthogonalization is that the estimates $\hat{\gamma}_k$ are uncorrelated, since
the off-diagonal elements of $(Z'Z)^{-1}$ are 0.
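
The estimators above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name `fit_orthogonal` is hypothetical, the columns `phi` are the monic orthogonal polynomials for $T = 5$, and the five observations are made-up numbers, not data from this report.

```python
def fit_orthogonal(y, phi):
    """Least squares with mutually orthogonal columns phi[0..p].

    Returns (gamma_hat, a, s2): the coefficient estimates, the diagonal
    entries a_kk of Z'Z, and the residual variance estimate with
    T - p - 1 degrees of freedom.
    """
    T, p = len(y), len(phi) - 1
    a = [sum(v * v for v in col) for col in phi]
    # Each coefficient is estimated separately because Z'Z is diagonal.
    gamma = [sum(yt * v for yt, v in zip(y, col)) / akk
             for col, akk in zip(phi, a)]
    fitted = [sum(g * col[t] for g, col in zip(gamma, phi)) for t in range(T)]
    rss = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    return gamma, a, rss / (T - p - 1)


# Illustrative inputs (assumed, not from the report).
phi = [[1, 1, 1, 1, 1], [-2, -1, 0, 1, 2], [2, -1, -2, -1, 2]]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
gamma_hat, a, s2 = fit_orthogonal(y, phi)
print(gamma_hat, a, s2)
```

Dropping the last column leaves the remaining estimates unchanged, which is the practical benefit of the diagonal form of $Z'Z$.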

If we assume that the $y_t$'s are normally distributed, then
$\hat{\gamma}_0, \ldots, \hat{\gamma}_p$ are independently and normally distributed, and $(T-p-1)s^2/\sigma^2$
is distributed as $\chi^2$ with $T-p-1$ degrees of freedom independently
of $\hat{\gamma}_0, \ldots, \hat{\gamma}_p$.  We can therefore test the null hypothesis that $\gamma_k = 0$
versus the alternative that $\gamma_k \neq 0$ at significance level $\alpha$ by
using the Student t-distribution with $T-p-1$ degrees of freedom.
That is, we reject the hypothesis that $\gamma_k = 0$ whenever

$$\frac{|\hat{\gamma}_k| \sqrt{a_{kk}}}{s} > t_{T-p-1}(\alpha) , \qquad (3.3)$$

where $t_{T-p-1}(\alpha)$ is the two-sided $\alpha$-significance point of the Student
t-distribution with $T-p-1$ degrees of freedom.


3.1.2  Determining  the  Degree  of  Polynomial  Trend 


Suppose, for the moment, that we have reason to believe that the
degree $p$ of the polynomial trend is at most a specified $q$.  Then a test
of the null hypothesis that the polynomial trend is of degree less than
$k$, given that its degree is at most $k$, for any $k \le q$, is a test of
the null hypothesis that $\gamma_k = 0$ against the alternative that $\gamma_k \neq 0$;
this test can be performed using the procedure given by (3.3).

In practice, we do not know in advance the value of $p$ that
we should use.  Our inclination is to use as low a value of $p$ as is
possible, so that our curve $f(t)$ is smooth and economical.  However,
a disadvantage of choosing too low a value of $p$ is that a bias is
introduced in our estimate of the trend.  To overcome this dilemma,
suppose that we have some a priori information which leads us to believe
that the lowest value that $p$ can take is $m$, where $m$ could be zero,
and that the maximum value which $p$ can take is some value $q$.  We are
now confronted with the problem of deciding whether the degree of our
polynomial is $m, m + 1, \ldots, q - 1$, or $q$.  A natural strategy would be
to work forward, starting by choosing $p = m + 1$ and testing
the null hypothesis that $\gamma_{m+1} = 0$ using (3.3); if this hypothesis is
rejected, then we test the hypothesis that $\gamma_{m+2} = 0$, and continue in
this manner until some hypothesis is accepted or until $\gamma_q = 0$ has been
rejected.  However, as is discussed on p. 42 of Anderson (1971), this
approach could lead us to an erroneous decision.  A better procedure
is to proceed backward, starting by choosing $p = q$ and
then testing the null hypothesis that $\gamma_q = 0$ using (3.3); if this
hypothesis is accepted, then we choose $p = q - 1$, and test the null
hypothesis that $\gamma_{q-1} = 0$ using equation (3.3) with the value of $s$
being recomputed.  Note that when we use orthogonal polynomials our
estimates of the coefficients do not change when we go from $p = q$ to
$p = q - 1$; this is another advantage of using orthogonal polynomials.

We continue in this manner until some hypothesis is rejected or until
$\gamma_{m+1} = 0$ has been accepted.  In the above (backward) procedure,
suppose that $\gamma_{q-j} = 0$, $j = 1,2,\ldots,q - m - 1$, is the first hypothesis to
be rejected.  Then our conclusion would be that the polynomial trend
$f(t)$ is of degree $q - j$; we do not proceed to test if $\gamma_{q-j-1} = 0$.
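
A minimal sketch of the backward procedure follows, assuming orthogonal columns for degrees $0$ through $q$ are already available. The function name `backward_degree` and the argument `t_crit` (a mapping from degrees of freedom to two-sided critical values that the user reads from a t-table) are hypothetical; the small data set is made up for illustration.

```python
import math


def backward_degree(y, phi, m, t_crit):
    """Select the degree of a polynomial trend by backward testing.

    phi[k] holds the orthogonal column of degree k for k = 0..q, and
    t_crit[df] is the two-sided critical value for df degrees of freedom.
    Returns the first degree p (from q down to m + 1) whose coefficient
    is significant according to (3.3), or m if none is.
    """
    T, q = len(y), len(phi) - 1
    a = [sum(v * v for v in col) for col in phi]
    gamma = [sum(yt * v for yt, v in zip(y, col)) / akk
             for col, akk in zip(phi, a)]
    for p in range(q, m, -1):
        fitted = [sum(gamma[k] * phi[k][t] for k in range(p + 1))
                  for t in range(T)]
        rss = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
        s = math.sqrt(rss / (T - p - 1))  # s is recomputed at each step
        # Reject gamma_p = 0 when |gamma_p| sqrt(a_pp) / s exceeds the
        # critical value; written without division so s = 0 is harmless.
        if abs(gamma[p]) * math.sqrt(a[p]) > t_crit[T - p - 1] * s:
            return p
    return m


phi = [[1, 1, 1, 1, 1], [-2, -1, 0, 1, 2], [2, -1, -2, -1, 2]]
t_crit = {2: 4.303, 3: 3.182}  # two-sided 5% points, from a t-table
print(backward_degree([1.0, 2.1, 2.9, 4.2, 4.8], phi, m=0, t_crit=t_crit))
```

Note that the coefficient estimates are computed once, outside the loop; only the residual variance changes as the degree is lowered.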

3.1.3  An  Example  Illustrating  the  Fitting  of  a  Polynomial  Trend 
In Table 3.1 (taken from T.W. Anderson (1971)), we show
$y_t$, the quantity of meat consumed per year per person in the United
States from 1919 to 1941; thus $T = 23$.  A plot of $y_t$ versus $t$ is
shown in Figure 3.1.  Our aim in the analysis presented below is
twofold:

(i)  to  illustrate  the  methodology  for  fitting  polynomial 
trends,  and 


(ii)  to attempt to make some comments on the nature of the
trend, if any, based on the fitted polynomial.

Table 3.1

Annual Consumption of Meat in the United States, from 1919 - 1941

Time Period    Annual Consumption    Fitted Values $\hat{y}_t$ Using
1918 + t       of Meat $y_t$         a Third Degree Polynomial

    1              171.5                  165.805
    2              167.0                  169.456
    3              164.5                  171.927
    4              169.3                  173.350
    5              179.4                  173.859
    6              179.2                  173.585
    7              172.6                  172.662
    8              170.5                  171.223
    9              168.6                  169.399
   10              164.7                  167.325
   11              163.0                  165.132
   12              162.1                  162.954
   13              160.2                  160.923
   14              161.2                  159.172
   15              165.8                  157.833
   16              163.5                  157.040
   17              146.7                  156.925
   18              160.2                  157.620
   19              156.8                  159.260
   20              156.8                  161.975
   21              165.4                  165.900
   22              174.7                  171.167
   23              178.7                  177.908

[Figure 3.1.  Plot of $y_t$ versus $t$.  Vertical axis: annual per capita consumption of meat.]

We treat the $y_t$'s as being independent and normally distributed
with an unknown variance $\sigma^2$.  An inspection of Figure 3.1 suggests
a cyclical pattern; motivated by this, we shall suppose that the trend
(if any) can be described by a polynomial in $t$ of degree at least 3
and at most 5.  A polynomial of degree 3, being a cubic function, is
suitable for describing the cyclical pattern in the data.

We start off by fitting a polynomial of degree 5 using the model

$$y_t = \gamma_0 \phi_{0,23}(t) + \gamma_1 \phi_{1,23}(t) + \cdots + \gamma_5 \phi_{5,23}(t) + u_t , \qquad t = 1,2,\ldots,23 ,$$

where $\gamma_0, \ldots, \gamma_5$ are the unknown coefficients.  Our design matrix is

$$Z = \begin{pmatrix}
\phi_{0,23}(1) & \phi_{1,23}(1) & \cdots & \phi_{5,23}(1) \\
\vdots & \vdots & & \vdots \\
\phi_{0,23}(23) & \phi_{1,23}(23) & \cdots & \phi_{5,23}(23)
\end{pmatrix} ,$$

with the entries $\phi_{k,23}(t)$, $t = 1,\ldots,23$, $k = 0,\ldots,5$, given in
Table 3.2; they are taken from Biometrika Tables for Statisticians, Vol. 1,
p. 215.  The entries in Table 3.2 are the values of the orthogonal
polynomials with leading coefficients $\lambda_0, \ldots, \lambda_5$ chosen to make the
values integers.  For example, when $T = 23$, and when $x(t) = t - \tfrac{1}{2}(T + 1)$,

$$\phi_{0,T}(t) = \lambda_0 , \qquad \phi_{1,T}(t) = \lambda_1 x(t) ,$$

$$\phi_{2,T}(t) = \lambda_2 \left\{ x^2(t) - \frac{T^2 - 1}{12} \right\} ,$$

$$\phi_{3,T}(t) = \lambda_3 \left\{ x^3(t) - \frac{1}{20}(3T^2 - 7)\,x(t) \right\} ,$$

and so on, where the values $\lambda_0, \ldots, \lambda_5$ are also given in Table 3.2.
The advantage of using the above formulas is that the orthogonal
polynomials are displayed in integers, with the result that rounding
errors are avoided.

Let $\gamma = (\gamma_0, \ldots, \gamma_5)'$; then $\hat{\gamma}$, the least squares estimator of
$\gamma$, is $\hat{\gamma} = (Z'Z)^{-1} Z'y$; thus an unbiased estimator of $\gamma_k$ is

$$\hat{\gamma}_k = \sum_{t=1}^{23} y_t\,\phi_{k,23}(t) \Big/ \sum_{t=1}^{23} \phi_{k,23}^2(t) , \qquad k = 0,\ldots,5 .$$

We find $\hat{\gamma} = (166.191,\ -.379051,\ .073574,\ .132745,\ .001314,\ .000052)'$
and the estimate of $\sigma^2$ is

$$s^2 = \sum_{t=1}^{23} \Big( y_t - \sum_{k=0}^{5} \hat{\gamma}_k \phi_{k,23}(t) \Big)^2 \Big/ (23 - 5 - 1) = 25.303 .$$

In order to see if the polynomial trend has degree less than 5,
given that it has degree at most 5, we compute $|\hat{\gamma}_5| \sqrt{a_{55}} / s$, where
$a_{55} = \sum_{t=1}^{23} (\phi_{5,23}(t))^2 = 340860$; thus, we have $(.000052)(583.832)/\sqrt{25.303}
= .00608$.  Since $t_{17}(.05) = 2.11$, we accept the null hypothesis that
$\gamma_5 = 0$, and conclude that the polynomial trend can be described by
a polynomial of order less than 5.  To test the hypothesis that
$\gamma_4 = 0$, we compute

$$s^2 = \sum_{t=1}^{23} \Big( y_t - \sum_{k=0}^{4} \hat{\gamma}_k \phi_{k,23}(t) \Big)^2 \Big/ (23 - 4 - 1) = 23.8973 ,$$

and now $|\hat{\gamma}_4| \sqrt{a_{44}} / \sqrt{23.8973}$ turns out to be .9738; since $t_{18}(.05) = 2.10$,
we accept the null hypothesis that $\gamma_4 = 0$.  We now carry out the
computations to test if $\gamma_3 = 0$; the appropriate test statistic
works out to be 4.93.  Because $t_{19}(.05)$ is equal to
2.093, we reject the hypothesis that $\gamma_3 = 0$, and conclude that the trend can be described
by a polynomial of order 3.  In Table 3.2.1, we summarize our computations
with the orthogonal polynomials.

Table 3.2

Orthogonal Polynomials for Fitting a Polynomial
of Degree 5 to 23 Observations

  t    $\phi_{0,23}(t)$   $\phi_{1,23}(t)$   $\phi_{2,23}(t)$   $\phi_{3,23}(t)$   $\phi_{4,23}(t)$   $\phi_{5,23}(t)$

  1         1               -11                77               -77              1463              -209
  2         1               -10                56               -35               133                76
  3         1                -9                37                -3              -627               171
 ...

Our fitted polynomial is therefore

$$\hat{y}_t = 166.191\,\phi_{0,23}(t) - .379\,\phi_{1,23}(t) + .074\,\phi_{2,23}(t) + .133\,\phi_{3,23}(t) , \qquad t = 1,\ldots,23 ,$$

$$= 166.191 - .379(t - 12) + .074\left\{(t - 12)^2 - \tfrac{1}{12} \times 528\right\} + \tfrac{.133}{6}\left\{(t - 12)^3 - \tfrac{1}{20}(3 \times 529 - 7)(t - 12)\right\}$$

$$= 160.849 + 5.670t - .724t^2 + .022t^3 .$$

A graph of the fitted polynomial together with the actual values of
$y_t$ is shown in Figure 3.2; the values of $\hat{y}_t$ are given in column 3
of Table 3.1.  It will be observed that Figure 3.2 gives a good fit,
most of the points being close to the curve.  We interpret the curve
as the expected or normal consumption of meat if it were not affected
by year-to-year irregularities.  The fitted third-degree polynomial
cannot be good for prediction, at least not very far into the future.
Far to the right of this data, this polynomial increases with an
increasing slope; even without the effect of war, it does not seem
reasonable that per capita meat consumption will increase indefinitely
at an increasing rate.
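
The conversion of the fitted trend from orthogonal polynomials to ordinary powers of $t$ is routine arithmetic, and can be checked with a short sketch such as the one below. Two assumptions are made and should be read as such: the scale factors are taken to be $\lambda_0 = \lambda_1 = \lambda_2 = 1$ and $\lambda_3 = 1/6$ (inferred from the first row of Table 3.2, where $\phi_{3,23}(1) = -77$), and the coefficient estimates are used in the rounded form quoted above. The helper names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (ascending powers)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(*polys):
    """Add polynomials of possibly different lengths."""
    n = max(len(p) for p in polys)
    return [sum(p[i] for p in polys if i < len(p)) for i in range(n)]


def scale(c, p):
    return [c * v for v in p]


T = 23
x = [-(T + 1) / 2, 1.0]                       # x(t) = t - (T + 1)/2
x2 = poly_mul(x, x)
x3 = poly_mul(x2, x)
phi0 = [1.0]
phi1 = x
phi2 = poly_add(x2, [-(T * T - 1) / 12])      # x^2 - 44
phi3 = scale(1 / 6, poly_add(x3, scale(-(3 * T * T - 7) / 20, x)))  # assumed lambda_3 = 1/6

gamma = [166.191, -0.379, 0.074, 0.133]       # rounded estimates from the text
coeffs = poly_add(*(scale(g, p) for g, p in zip(gamma, [phi0, phi1, phi2, phi3])))
print([round(c, 3) for c in coeffs])          # constant, t, t^2, t^3
```

With these inputs the coefficients agree, to three decimals, with $160.849 + 5.670t - .724t^2 + .022t^3$.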


Table 3.2.1.  Computations with Orthogonal Polynomials for the
Example of Meat Consumption


Figure 3.2.  Third degree polynomial fit to data on annual
consumption of meat.

Instead of using cubic terms to describe the cyclical pattern
in the data, a more effective way might be to use Fourier series
(sums of sine and cosine terms).  This topic will be considered later
on.  In Example 2 of Section 3.2.3, we will reconsider this data and
illustrate how the technique of smoothing can be used to reduce
its variability.


3.2  Smoothing

In order to estimate the trend at a given point in time, it may
be more meaningful to consider only those observations which are in
the neighborhood of the time point rather than all the observations,
as was done in Section 3.1.  A procedure which accomplishes this is known
as smoothing; here the trend at a given point in time is the weighted
average of the observations in the vicinity of that point.  If we smooth
over all the points in time (except the first and last few), then an
irregular graph of the observed points will be replaced by a smooth
graph.  Specifically, given the time series $y_1, y_2, \ldots, y_T$, an estimate
of the trend at time $t$ is given by

$$y_t^* = \sum_{s=-m}^{m} c_s\,y_{t+s} , \qquad t = m+1, \ldots, T-m ,$$

where $m$ is some suitable constant, and the $c_s$'s are weights
which sum to 1; that is, $\sum_{s=-m}^{m} c_s = 1$.  The resulting sequence $\{y_t^*\}$,
$t = m+1, \ldots, T-m$, is called a moving average of the original
sequence $\{y_t\}$.  Since we have assumed that $y_t = f(t) + u_t$, it follows
that

$$y_t^* = \sum_{s=-m}^{m} c_s\,y_{t+s} = \sum_{s=-m}^{m} \{ c_s f(t + s) + c_s u_{t+s} \} = \sum_{s=-m}^{m} c_s f(t + s) + u_t^* ,$$

where

$$u_t^* = \sum_{s=-m}^{m} c_s\,u_{t+s} .$$

Since $\mathcal{E}u_t = 0$, $\mathcal{E}u_t^2 = \sigma^2$ and $\mathcal{E}u_t u_s = 0$, $t \neq s$, we have $\mathcal{E}u_t^{*2} = \sigma^2 \sum_{s=-m}^{m} c_s^2$;
we should choose the $c_s$'s in such a manner that $\mathcal{E}u_t^{*2}$ is
considerably smaller than $\mathcal{E}u_t^2$.  Furthermore, since $\mathcal{E}y_t^* = \sum_{s=-m}^{m} c_s f(t + s)$,
in general $\mathcal{E}y_t^* \neq f(t)$; and so unless the values $f(t + s)$, $s = \pm 1, \ldots, \pm m$, are
all close to $f(t)$ (that is, the trend does not change rapidly),
the smoothing will introduce some bias.  In general, the smoothed
sequence $\{y_t^*\}$ has a smaller variance than the original sequence $\{y_t\}$,
but is biased.  Another important consequence of smoothing is that the
successive terms in the smoothed sequence are correlated, even though
the original sequence was not.  Specifically, for any $h \ge 0$,

$$\mathcal{E}u_t^* u_{t+h}^* = \sum_{s=-m}^{m} \sum_{r=-m}^{m} c_s c_r\,\mathcal{E}u_{t+s} u_{t+h+r}
= \begin{cases}
\sigma^2 \sum_{s=-m+h}^{m} c_s c_{s-h} , & h = 0,1,\ldots,2m , \\
0 , & h = 2m + 1, \ldots .
\end{cases}$$

A moving average process, to be introduced later, is mathematically
equivalent to the sequence $\{y_t^*\}$, but usually the trend term is absent
and it is the random sequence $\{u_t^*\}$ that is relevant.

As an example, suppose that we choose $c_s = 1/(2m + 1)$; that
is, we take the arithmetic average of the $2m + 1$ points.  Then

$$y_t^* = \frac{1}{2m + 1} \sum_{s=-m}^{m} f(t + s) + \frac{1}{2m + 1} \sum_{s=-m}^{m} u_{t+s} ,$$

where $\mathcal{E}u_t^{*2} = \sigma^2/(2m + 1)$, which is less than $\sigma^2$, and

$$\mathcal{E}u_t^* u_{t+h}^* = \begin{cases}
\dfrac{2m + 1 - h}{(2m + 1)^2}\,\sigma^2 , & h = 0,1,\ldots,2m , \\
0 , & \text{otherwise.}
\end{cases}$$

Clearly, the variance of $u_t^*$ can be reduced by choosing $m$ large, but
then $|\mathcal{E}y_t^* - f(t)|$ may also increase.
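
The moving average and the covariance of its random part can be sketched directly from the formulas above. The function names `moving_average` and `noise_autocovariance` are hypothetical, and the short input series is made up for illustration.

```python
def moving_average(y, weights):
    """Smoothed values y*_t for t = m+1..T-m, with weights c_{-m}..c_{m}."""
    m = (len(weights) - 1) // 2
    return [sum(c * y[t + s] for s, c in zip(range(-m, m + 1), weights))
            for t in range(m, len(y) - m)]


def noise_autocovariance(weights, h, sigma2=1.0):
    """sigma^2 * sum over s of c_s c_{s-h}; zero once h exceeds 2m."""
    n = len(weights)
    return sigma2 * sum(weights[i] * weights[i - h] for i in range(h, n))


m = 2
equal = [1.0 / (2 * m + 1)] * (2 * m + 1)
print(moving_average([1, 2, 3, 4, 5, 6, 7], equal))
print([noise_autocovariance(equal, h) for h in range(0, 2 * m + 2)])
```

For equal weights the second list follows $(2m + 1 - h)/(2m + 1)^2$ for $h \le 2m$ and is zero afterwards, in agreement with the expression above.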


3.2.1  The  Theory  Underlying  the  Smoothing  Procedure  and  Method 
for  Obtaining  the  Smoothing  Coefficients 

The  notion  underlying  the  smoothing  procedure  is  that  instead 
of  fitting  a  polynomial  of  degree  p  to  the  entire  set  of  data  (as 
was  done  in  Section  3.1),  we  fit  a  polynomial  of  degree  p  to  2m  +  1 
successive  values,  and  then  use  this  polynomial  to  estimate  the  trend 
at  the  middle  value.  Suppose  that  we  consider  the  2m  +  1  time  points 
around  t,  say  t,  t  ±  1,  t  ±  2 , . . .  ,t  ±  m,  and  suppose  that  the  trend 
at  these  time  points  f(t  +  s),  s  =  0,±l,...,±m,  can  be  approximated 
by  the  polynomial 

$$\beta_0 + \beta_1 s + \beta_2 s^2 + \cdots + \beta_p s^p , \qquad s = 0, \pm 1, \pm 2, \ldots, \pm m .$$

Then, our trend at time $t$, $f(t)$, is approximated by $\beta_0$, where $\beta_0$
is obtained by setting $s = 0$ in the above equation.  Our estimate
of $\beta_0$, say $\hat{\beta}_0$, is obtained by performing a least squares analysis using
the observations $y_{t-m}, \ldots, y_{t+m}$, provided that $2m + 1 > p$.  If we go
through the detailed, though straightforward, steps of this analysis
[Anderson (1971), p. 49], then we see that $\hat{\beta}_0$ is of the general form
$\sum_{s=-m}^{m} c_s y_{t+s}$, with $c_s = c_{-s}$ and $\sum_{s=-m}^{m} c_s = 1$.  The coefficients $c_s$ are
polynomials in $s$ depending on $m$ and $k$, where $k = [p/2]$.¹

For example, if we choose $p = 0$ or 1, that is, if the polynomial
is of degree 0 or 1, then $\hat{\beta}_0 = \sum_{s=-m}^{m} y_{t+s}/(2m + 1)$; thus $c_s = (2m + 1)^{-1}$
and we obtain a moving average with equal weights.  When $k = 1$, that
is, when $p = 2$ or 3,

$$\hat{\beta}_0 = \sum_{s=-m}^{m} \frac{[3(3m^2 + 3m - 1) - 15s^2]\,y_{t+s}}{(2m - 1)(2m + 1)(2m + 3)} .$$

In Table 3.3 given below, we list the values of the $c_s$'s for $k = 1$,
and $m = 2, 3, 4$ and 5.

Since $\hat{\beta}_0$ is obtained via a least squares analysis, we should
bear in mind that if $m < k$, $\hat{\beta}_0$ is undetermined, whereas if $m = k$,
we are fitting a polynomial of degree $2m$ to $2m + 1$ points and
thus $\hat{\beta}_0 = y_t$; that is, the fit is perfect.  If $m > k$, the moving
average is nontrivial, in the sense that it involves several values of
$y_{t+s}$.

¹ $[a]$ denotes the largest integer less than or equal to $a$.
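
The weights for $k = 1$ follow directly from the formula above; the sketch below evaluates them exactly. The function name `quadratic_weights` is hypothetical, and exact fractions are used only so that the symmetry and the sum-to-one property can be checked without rounding.

```python
from fractions import Fraction


def quadratic_weights(m):
    """Weights c_s, s = -m..m, for local quadratic or cubic fitting (k = 1)."""
    den = (2 * m - 1) * (2 * m + 1) * (2 * m + 3)
    return [Fraction(3 * (3 * m * m + 3 * m - 1) - 15 * s * s, den)
            for s in range(-m, m + 1)]


for m in (2, 3):
    c = quadratic_weights(m)
    print(m, [str(v) for v in c], "sum =", sum(c))
```

For $m = 2$ this gives $(-3, 12, 17, 12, -3)/35$, and for $m = 1 = k$ it gives $(0, 1, 0)$, the perfect-fit case noted above.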


Table  3.3 


Coefficients  in  Smoothing  Formulas  for  k  =  1 


3.2.2  Some  Remarks  on  Smoothing 

The main purpose of smoothing is to make the variance of the
smoothed sequence $y_t^*$ small relative to the variance of the original
sequence $y_t$.  We have seen before that when $k = 0$ ($p = 0$ or 1),
$c_s = (2m + 1)^{-1}$, and the variance of $y_t^*$ equals $\sigma^2/(2m + 1)$.  In
general, it can be shown [Anderson (1971), Theorem 3.3.1] that the
variance of $y_t^*$ equals $\sigma^2 c_0$.  Since the smoothed value $y_t^*$ can be
used as an estimate of the trend at time $t$, we can also say that the
main purpose of smoothing is to estimate the trend of $y_t$ with minimum
error.  The error consists of two parts:

i)  the bias $\mathcal{E}(y_t - y_t^*) = f(t) - \sum_{s=-m}^{m} c_s f(t + s)$, and

ii)  the random part $u_t^* = \sum_{s=-m}^{m} c_s u_{t+s}$.

For a given value of $p$, the bias goes up (in most cases)
and the variance of the random part goes down as we increase $m$.  For
a given value of $m$, the bias goes down but the variance increases
as we increase $p$.  We are therefore faced with deciding between
the values of $m$ and $p$.  One possibility may be to choose those values
of $m$ and $p$ that minimize the mean squared error, which is the sum
of the variance and the squared bias.  However, since $\sigma^2$ is unknown,
it is difficult to formulate this problem mathematically and give a
satisfactory solution.  Thus, our choice of $m$ and $p$ must be made
on the basis of general experience and intuition.

Another difficulty with smoothing is that to obtain $y_t^*$, the
estimated trend at time $t$, we have to use $y_{t-m}, \ldots, y_{t+m}$; thus the
first smoothed value is $y_{m+1}^*$ and the last is $y_{T-m}^*$.  We therefore do
not have an estimate of the trend at the beginning and at the end of
the time period.

3.2.3  Examples  Illustrating  the  Practical  Value  of  Smoothing 
Example  1 

Bhattacharya and Klotz (1966) use freezing dates and thawing dates
of Lake Mendota to test for a warming trend.  Their data covers a period
of 111 years, namely 1854 to 1965.  The number of days to freezing in
each winter season is measured from Nov. 23.  The winter seasons are
indexed by $t = 1, \ldots, 111$, with $t = 1$ denoting 1854-1855.  In the interest
of keeping our exposition simple, only a portion of the data is considered
by us.  The abstracted data is given in Table 3.4.  The number of days
to freezing measured from Nov. 23, 1853 + $t$, $t = 1, \ldots, 12$, is denoted
by $y_t$.  A tendency for $y_t$ to increase with $t$ is interpreted as
a warming trend.

In Figure 3.2.1, we show a graph of $y_t$ versus $t$; this graph
reveals a large amount of variability which tends to conceal a trend.
We shall soon see that a graph of the smoothed data (Figure 3.3) reveals
the trend more conspicuously.

For illustrative purposes, let us choose $m = 1$ and $p = 1$;
thus, we will be fitting a polynomial of degree 1 to 3 successive
observations.  Note that the cyclical nature of the data suggests that
we consider small values of $m$, unless our goal is to overcome the
cycles and just look for a linear trend.  For these values of $m$ and
$p$, our smoothed sequence is

$$y_t^* = \sum_{s=-m}^{m} c_s\,y_{t+s} , \qquad t = m + 1, \ldots, T - m ,$$

$$= \frac{1}{2m + 1} \sum_{s=-m}^{m} y_{t+s} , \qquad t = m + 1, \ldots, T - m ,$$

$$= \frac{1}{3} \sum_{s=-1}^{1} y_{t+s} , \qquad t = 2, \ldots, 11 .$$

In Table 3.4, we show the smoothed values $y_t^*$ obtained by using
the above formula.

In Figure 3.3, we show a plot of $y_t$ and $y_t^*$ versus $t$.  Note
that the plot of $y_t^*$ shows less fluctuation than the plot of $y_t$; this
is because the effect of smoothing is to reduce the variability of the
$y_t$.  The main advantage of smoothing this data is brought about by
the fact that the $y_t^*$'s reveal an upward trend.

Table 3.4

Observed and Smoothed Values of the Number of Days to
Freezing of Lake Mendota
(from Nov. 23) 1854, ..., 1865

Time Period t    Number of Days to     Smoothed Values $y_t^*$    Smoothed Values $y_t^{**}$
1853 + t         Freezing, $y_t$       m = 1, p = 1               m = 2, p = 1
                 (from Nov. 23)

     1                25                    -                          -
     2                13                   13.3                        -
     3                 2                   10.0                       13.8
     4                15                   10.3                       13.0
     5                14                   16.7                       12.2
     6                21                   14.7                       18.4
     7                 9                   21.0                       20.4
     8                33                   22.3                       20.6
     9                25                   24.3                       20.6
    10                15                   20.3                       23.8
    11                21                   20.3                        -
    12                25                    -                          -

[Figure: vertical axis, number of days to freezing.]

An examination of the entries in Table 3.4 shows that the smoothing
is accomplished by pulling closer the values of $y_t$.  For example, the
values $y_2^*$, $y_3^*$, and $y_4^*$ are closer together than the values $y_2$, $y_3$,
and $y_4$, the value $y_3 = 2$ being changed by smoothing to $y_3^* = 10$.

We could continue to smooth this data further by choosing other
values of $m$ and $p$.  For example, if we choose $m = 2$ and $p = 1$,
then the smoothed values $y_t^{**}$ will be brought still closer; these
are shown in column 4 of Table 3.4.
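
The two smoothed columns of Table 3.4 can be reproduced with a few lines of code. The sketch below uses the twelve observed values from the table; the function name `equal_weight_smooth` is hypothetical.

```python
# Days to freezing of Lake Mendota, t = 1..12 (observed column of Table 3.4).
y = [25, 13, 2, 15, 14, 21, 9, 33, 25, 15, 21, 25]


def equal_weight_smooth(y, m):
    """Equal-weight moving average over 2m + 1 points, for t = m+1..T-m."""
    width = 2 * m + 1
    return [sum(y[t - m:t + m + 1]) / width for t in range(m, len(y) - m)]


smooth1 = equal_weight_smooth(y, 1)   # values for t = 2..11
smooth2 = equal_weight_smooth(y, 2)   # values for t = 3..10
print([round(v, 1) for v in smooth1])
print([round(v, 1) for v in smooth2])
```

The first and last $m$ time points have no smoothed value, which is why the corresponding table entries are blank.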

Example  2 

We shall now reconsider the data of Table 3.1 on $y_t$, the annual
meat consumption in the U.S. in year $t$; this data is graphed in
Figure 3.1.  Recall that a polynomial in $t$ of degree 3 provides us
with a good description of the cyclical pattern in this data.  We
shall now illustrate how the variability in this data can be reduced
by smoothing, with the result that the cyclical pattern becomes more
conspicuous.  We choose $m = 2$ and $p = 1$ in our smoothing formula,
and show the smoothed values $y_t^*$ in column 3 of Table 3.5; also shown
in column 2 of Table 3.5 are the actual values $y_t$.  In Figure 3.3.1
we plot the actual and the smoothed values versus $t$.  It is of interest
to compare the plot of $y_t^*$ versus $t$ with the plot of the third degree
polynomial fit $\hat{y}_t$ versus $t$ given in Figure 3.2.  It appears that $y_t^*$
is closer to $\hat{y}_t$ than the original values $y_t$.


Table 3.5

Observed and Smoothed Values of the Annual Consumption of
Meat in the United States from 1919-1941

Time Period t    Annual Consumption    Smoothed Values $y_t^*$
1918 + t         of Meat $y_t$         m = 2, p = 1

     1               171.5                   -
     2               167.0                   -
     3               164.5                 170.34
     4               169.3                 171.88
     5               179.4                 173.00
     6               179.2                 174.20
     7               172.6                 174.06
     8               170.5                 171.12
     9               168.6                 167.88
    10               164.7                 165.78
    11               163.0                 163.72
    12               162.1                 162.24
    13               160.2                 162.46
    14               161.2                 162.56
    15               165.8                 159.48
    16               163.5                 159.48
    17               146.7                 158.60
    18               160.2                 156.80
    19               156.8                 157.18
    20               156.8                 162.78
    21               165.4                 166.48
    22               174.7                   -
    23               178.7                   -

[Figure 3.3.1.  Observed and smoothed values of the annual consumption
of meat in the United States from 1919-1941.  Vertical axis: annual per
capita consumption of meat; horizontal axis: time.]
3.2.4  Smoothing  and  Seasonal  Variation 


Many economic time series have a seasonal factor; in that case we want
to use the model $\mathcal{E}y_t = f(t) = g(t) + h(t)$, where $h(t)$ is a trend
and $g(t)$ is a periodic function of period $n$.  For example, $n = 4$
when we are dealing with quarterly data such as dividends of A.T.&T.
stock, $n = 12$ when we are dealing with monthly data such as the
consumer price index, and $n = 52$ when we are dealing with weekly data,
such as the yield of Treasury bills.

The defining characteristic of the periodic function $g(t)$ is

$$g(t + n) = g(t) , \qquad t = 1, 2, \ldots, T - n .$$

We can always normalize the periodic function $g(t)$ so that
$\sum_{t=1}^{n} g(t) = 0$; that is, we can center the periodic function about 0.  When
this is done, we note that $\sum_{t=1}^{n} g(t + s) = 0$, $s = 0, 1, \ldots, T - n$.  It
is common to choose $T$ to be a multiple of $n$, say $T = kn$, where $k$
is some integer.

In Figure 3.4(a) we illustrate the behavior of a periodic
function centered at some constant $C$.  In Figure 3.4(b) we illustrate
the behavior of another periodic function centered at 0.

Suppose that we consider a moving average of $n$ terms (where
$n$ is the period) with equal coefficients; that is, we consider
$\frac{1}{n} \sum_{s=1}^{n} y_{t+s}$.  Recall that the weights $c_s = \frac{1}{n}$ when we smooth over $n$
observations and choose $p = 0$ or 1 ($m = (n - 1)/2$).  Then

$$\mathcal{E}\,\frac{1}{n} \sum_{s=1}^{n} y_{t+s} = \frac{1}{n} \sum_{s=1}^{n} \big( h(t + s) + g(t + s) \big) = \frac{1}{n} \sum_{s=1}^{n} h(t + s) ,$$

since a centering of the periodic function $g(t)$ ensures that
$\frac{1}{n} \sum_{s=1}^{n} g(t + s) = 0$.

[Figure 3.4.  A sinusoidal periodic function: (a) centered at a constant C; (b) centered at 0.]

Thus a moving average of $n$ terms with equal weights will eliminate
the seasonal variation $g(t)$, whenever the period of $g(t)$ is $n$.  This
procedure is sometimes proposed to eliminate the effect of a cyclical
movement when estimating the trend.
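
A small numerical check of this property is sketched below. The linear trend, the four seasonal effects, and the function name `mean_of_next_n` are all assumptions chosen for illustration; no noise is added, so the identity holds exactly up to floating-point error.

```python
def mean_of_next_n(values, t, n):
    """(1/n) * sum of the values at times t+1, ..., t+n (times are 1-based)."""
    return sum(values[t + s - 1] for s in range(1, n + 1)) / n


n = 4                                   # period, e.g. quarterly data
g_cycle = [3.0, -1.0, -4.0, 2.0]        # seasonal effects, centered: sum is 0
T = 16
h = [10.0 + 0.5 * t for t in range(1, T + 1)]          # trend h(t)
g = [g_cycle[(t - 1) % n] for t in range(1, T + 1)]    # periodic g(t)
f = [ht + gt for ht, gt in zip(h, g)]                  # f(t) = g(t) + h(t)

for t in range(0, T - n + 1):
    # Averaging f over one full period leaves only the average of the trend.
    assert abs(mean_of_next_n(f, t, n) - mean_of_next_n(h, t, n)) < 1e-12
print("seasonal component removed for every window of length", n)
```

If the window length differs from the period, the seasonal terms no longer cancel and the check fails.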

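If $n$ is even, that is, if $n = 2m$, we use

$$y_t^* = \frac{1}{2m} \left[ \sum_{s=-(m-1)}^{m-1} y_{t+s} + \tfrac{1}{2} y_{t-m} + \tfrac{1}{2} y_{t+m} \right] , \qquad t = m + 1, \ldots, T - m .$$

Then

$$\mathcal{E}y_t^* = \frac{1}{2m} \left[ \sum_{s=-(m-1)}^{m-1} f(t + s) + \tfrac{1}{2} f(t - m) + \tfrac{1}{2} f(t + m) \right]
= \frac{1}{2m} \left[ \sum_{s=-(m-1)}^{m-1} h(t + s) + \tfrac{1}{2} h(t - m) + \tfrac{1}{2} h(t + m) \right] ,$$

since $(1/4m)\,g(t - m) = (1/4m)\,g(t + m)$.  If $h(t)$ is changing slowly,
then $\mathcal{E}y_t^*$ will be close to $h(t)$; in particular if $h(t)$ is linear,
$\mathcal{E}y_t^*$ will equal $h(t)$.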

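When $T = kn$, we can define $g(t)$ uniquely by

$$g(t) = \frac{1}{k} \sum_{j=0}^{k-1} f(t + nj) - \frac{1}{T} \sum_{s=1}^{T} f(s) , \qquad t = 1, 2, \ldots, n .$$

For example, the seasonal effect of December is the difference
between the average of all Decembers and the over-all average.  Thus,
an estimate of $g(t)$ is

$$\hat{g}(t) = \frac{1}{k} \sum_{j=0}^{k-1} y_{t+nj} - \frac{1}{T} \sum_{s=1}^{T} y_s , \qquad t = 1, 2, \ldots, n .$$

Clearly, this estimator is unbiased.  Its variance can be calculated
as follows:

$$\mathrm{Var}\,\hat{g}(t) = \mathrm{Var}\left[ \Big( \frac{1}{k} - \frac{1}{T} \Big) \sum_{j=0}^{k-1} y_{t+nj} \right] + \mathrm{Var}\left[ \frac{1}{T} \sum_{s \neq t+nj} y_s \right] ,$$

since the common terms in the above two summation signs have been
absorbed under one summation.  Thus

$$\mathrm{Var}\,\hat{g}(t) = \Big( \frac{1}{k} - \frac{1}{T} \Big)^2 k\,\sigma^2 + \Big( \frac{1}{T} \Big)^2 (T - k)\,\sigma^2 = \Big( \frac{1}{k} - \frac{1}{T} \Big) \sigma^2 .$$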

3.3  The  Variate  Difference  Method 

The variate difference method is sometimes used to estimate the
variance of the (uncorrelated) error term $u_t$ when the trend $f(t)$
is smooth; that is, when $f(t)$ can be reasonably well approximated by
a polynomial of a low degree.  Another use of variate differences is to
test for the lack of correlation.

In the variate difference method, we consider successive differences
of elements in a time series.  We shall need to define an operator $P$
where $Pu_t = u_{t+1}$, for $t = \ldots, -1, 0, 1, \ldots$.  Given a sequence $\{u_t\}$,
$t = \ldots, -1, 0, 1, \ldots$, $P\{u_t\} = \{u_{t+1}\}$; thus the operator $P$ when applied to a
sequence gives us another sequence in which the subscripts are shifted by 1.
$P$ is said to be a linear operator because

i)  for each sequence $\{u_t\}$, and each real number $c$,

$$P\{c u_t\} = c P\{u_t\} , \text{ and}$$

ii)  for each pair of sequences $\{u_t\}$ and $\{v_t\}$,

$$P\{u_t + v_t\} = P\{u_t\} + P\{v_t\} .$$

We shall define $P^0 u_t = u_t$ and $P^n(u_t) = P(P^{n-1} u_t)$, $n = 2, 3, \ldots$;
thus, we can verify that $P^n(u_t) = u_{t+n}$.  Also, by definition
$(cP)u_t = cP(u_t)$, and

$$(c_1 P^{n_1} + c_2 P^{n_2} + \cdots + c_k P^{n_k}) u_t = c_1 P^{n_1} u_t + c_2 P^{n_2} u_t + \cdots + c_k P^{n_k} u_t = c_1 u_{t+n_1} + c_2 u_{t+n_2} + \cdots + c_k u_{t+n_k} .$$

Consider the forward difference operator $\Delta$, where $\Delta u_t = u_{t+1} - u_t$;
since $\Delta u_t = u_{t+1} - u_t = Pu_t - u_t = (P - 1)u_t$, it follows that
$\Delta = P - 1$.  The second-order forward difference $\Delta^2 u_t$
is $\Delta^2 u_t = \Delta(\Delta u_t) = \Delta(u_{t+1} - u_t) = \Delta u_{t+1} - \Delta u_t = u_{t+2} - 2u_{t+1} + u_t$.
Equivalently, $\Delta^2 u_t = (P - 1)^2 u_t = (P^2 - 2P + 1)u_t = u_{t+2} - 2u_{t+1} + u_t$,
where in squaring $(P - 1)$ we treat $P$ as a real coefficient.  In general,
we have

$$\Delta^r = (P - 1)^r = \sum_{j=0}^{r} (-1)^j \binom{r}{j} P^{r-j} .$$

Consider a polynomial in $t$ of degree $p$; for example, let $f(t) = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + \cdots + \alpha_p t^p$, and take its first forward difference

$$\begin{aligned}
\Delta f(t) &= \Delta(\alpha_0 + \alpha_1 t + \alpha_2 t^2 + \cdots + \alpha_p t^p) \\
&= \alpha_0 + \alpha_1 (t+1) + \alpha_2 (t+1)^2 + \cdots + \alpha_p (t+1)^p - \alpha_0 - \alpha_1 t - \alpha_2 t^2 - \cdots - \alpha_p t^p \\
&= \alpha_1 + \alpha_2 \left((t+1)^2 - t^2\right) + \cdots + \alpha_p \left((t+1)^p - t^p\right) \\
&= p \alpha_p t^{p-1} + (\text{constant})\, t^{p-2} + \cdots + (\alpha_p + \alpha_{p-1} + \cdots + \alpha_1) .
\end{aligned}$$

Thus, by taking the first forward difference, we have reduced the degree of the polynomial by 1. In general, if $f(t)$ is a polynomial of degree $p$, then

$$\Delta^r f(t) = 0 , \quad r = p+1, p+2, \ldots$$

Thus, if we consider a trend to be approximated by a polynomial of degree $p$, then by taking $p + 1$ or more forward differences of the trend, we can reduce the trend to 0, approximately.
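The degree-reduction property can be checked numerically. The following is a minimal Python sketch and not part of the original report; the helper name `forward_difference` and the example polynomial are assumptions chosen for illustration.

```python
def forward_difference(seq, r=1):
    """Apply the forward difference r times: (Delta u)_t = u_{t+1} - u_t."""
    out = list(seq)
    for _ in range(r):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical example: f(t) = 2 + 3t - t^2 + 4t^3 (degree p = 3), t = 0, ..., 9.
f = [2 + 3 * t - t**2 + 4 * t**3 for t in range(10)]

third = forward_difference(f, 3)   # r = p: a constant sequence, p! * alpha_p
fourth = forward_difference(f, 4)  # r = p + 1: identically zero

print(third)
print(fourth)
```

Each application shortens the sequence by one term, so a series of length $T$ yields $T - r$ values of the $r$-th difference.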

3.3.1  Taking  Differences  of  the  Observed  Series 
Suppose that we have an observed series $\{y_t\}$ which will be treated as composed of a trend $f(t)$ and a random error $u_t$. Since $\Delta = (P - 1)$, $\Delta$ is a linear operator, and

$$\Delta y_t = \Delta(f(t) + u_t) = \Delta f(t) + \Delta u_t .$$

If $f(t)$ is a polynomial in $t$ of degree $p$, where $p < r$, then $\Delta^r f(t) = 0$; thus $\Delta^r y_t = \Delta^r f(t) + \Delta^r u_t = \Delta^r u_t$. It now follows that

$$\mathcal{E} \Delta^r y_t = 0 .$$

This is important because it shows one method of eliminating a (polynomial) trend.

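The differencing also affects the variance; we have

$$\operatorname{Var}(\Delta^r y_t) = \operatorname{Var}(\Delta^r u_t) = \operatorname{Var}\left((P - 1)^r u_t\right) .$$

In order to obtain $\operatorname{Var}\left((P - 1)^r u_t\right)$, we note that

$$(P - 1)^r u_t = \left[P^r - \binom{r}{1} P^{r-1} + \cdots + (-1)^r\right] u_t = u_{t+r} - \binom{r}{1} u_{t+r-1} + \cdots + (-1)^r u_t .$$

Thus $\operatorname{Var}(\Delta^r y_t) = \sigma^2 \left[1 + \binom{r}{1}^2 + \cdots + (-1)^{2r}\right] = \sigma^2 \binom{2r}{r}$. For a proof of the last equality given above, we refer the reader to Anderson (1971), p. 64.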
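The binomial identity quoted from Anderson (1971) can be spot-checked for small $r$, and it suggests how a mean square of $r$-th differences can be put back on the scale of $\sigma^2$. The sketch below is illustrative only; the function names `variance_factor` and `scaled_mean_square` are assumptions, not notation from the report.

```python
from math import comb

def forward_difference(seq, r=1):
    """Apply the forward difference r times."""
    out = list(seq)
    for _ in range(r):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def variance_factor(r):
    """Sum of squared binomial coefficients, i.e. Var(Delta^r u_t) / sigma^2."""
    return sum(comb(r, j) ** 2 for j in range(r + 1))

def scaled_mean_square(y, r):
    """Mean of (Delta^r y_t)^2 over the T - r available terms, divided by C(2r, r)."""
    d = forward_difference(y, r)
    return sum(v * v for v in d) / (len(d) * comb(2 * r, r))

# The identity sum_j C(r, j)^2 = C(2r, r), checked for r = 1, ..., 10.
identity_holds = all(variance_factor(r) == comb(2 * r, r) for r in range(1, 11))

# A pure linear trend (degree 1 < r = 2) is removed completely by differencing.
linear_trend = [1 + 2 * t for t in range(8)]
print(identity_holds, scaled_mean_square(linear_trend, 2))
```

When the series contains random errors as well as a polynomial trend of degree below $r$, the same quantity is nonzero and reflects only the error variance.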


The above results enable us to propose an estimator of $\sigma^2$, say $V_r$, where

$$V_r = \frac{\sum_{t=1}^{T-r} (\Delta^r y_t)^2}{(T - r) \binom{2r}{r}} .$$

4. Cyclical Trends

In  many  time  series,  especially  those  involving  economic  data, 
the trend $f(t)$ is a periodic function of time; that is $f(t + \phi) = f(t)$,
for some period $\phi$. Thus, given the function $f(t)$ on any interval
of length $\phi$, we can determine it over its entire range. We shall
consider here the analysis of a time series when the trend $f(t)$ can
be specified in terms of linear combinations of sines and cosines; this
seems  like  a  natural  way  to  consider  periodic  trends.  However,  we  shall 
first  need  the  following  preliminary  notions. 

4.1  Transformations  and  Representations 

In  this  section  we  shall  see  some  alternate  ways  of  representing 
any sequence of T numbers $y_1, y_2, \ldots, y_T$ (not necessarily a time series),
and  a  periodic  function  f(t).  In  Section  4.2  we  shall  return  to  the 
analysis  of  time  series  by  applying  some  of  the  techniques  discussed 
in  this  section. 

4.1.1  Trigonometric  Functions  and  their  Orthogonality 
The trigonometric functions $\sin t$ and $\cos t$ are periodic with period $2\pi$; that is

$$\sin(t + 2\pi) = \sin t , \quad \cos(t + 2\pi) = \cos t .$$

Furthermore, for any $k$, $k = 0, \pm 1, \pm 2, \ldots$,

$$\sin(t + 2\pi k) = \sin t , \quad \cos(t + 2\pi k) = \cos t .$$

Suppose that we wish to make a linear transformation of the argument $t$ in $\sin(\cdot)$ and $\cos(\cdot)$ by multiplying it by some constant $\lambda$. Then, $\sin \lambda t$ and $\cos \lambda t$ will also be periodic, but the period is now $2\pi/\lambda$. That is,

$$\sin\left(\lambda\left(t + \frac{2\pi}{\lambda}\right)\right) = \sin \lambda t , \quad \cos\left(\lambda\left(t + \frac{2\pi}{\lambda}\right)\right) = \cos \lambda t .$$

The effect of $\lambda$ is to expand or to contract the time scale; small values of $\lambda$ expand the time scale whereas large values of $\lambda$ contract it. (See Figure 4.1.)

The reciprocal of the period is called the frequency. The frequency need not be integer valued. The frequency denotes the number of periods in a unit interval.

In dealing with the periodic functions $\sin t$ and $\cos t$, we may want to shift (or translate) the entire sine or the cosine curve. This is accomplished by the introduction of a shift parameter $\theta$. More specifically,

$$\sin\left(\lambda\left(t + \frac{2\pi}{\lambda}\right) - \theta\right) = \sin(\lambda t + 2\pi - \theta) = \sin(\lambda t - \theta) , \text{ and}$$

$$\cos\left(\lambda\left(t + \frac{2\pi}{\lambda}\right) - \theta\right) = \cos(\lambda t + 2\pi - \theta) = \cos(\lambda t - \theta) .$$

Note that since the maximum of $\cos \lambda t$ occurs at $\lambda t = 2\pi k$, $k = 0, \pm 1, \pm 2, \ldots$, the maximum of $\cos(\lambda t - \theta)$ occurs at $\lambda t = 2\pi k + \theta$; that is, at $t = (\theta + 2\pi k)/\lambda$. The angle $\theta$ is called the phase. Usually we choose $\theta$ in such a manner that the first maximum occurs at $t = \theta/\lambda$. At $t = 0$ the function is either $\cos \theta$ or $-\sin \theta$.


In Figure 4.1 we illustrate the above properties via a cosine function with a frequency of $1/(2\pi)$, $1/(4\pi)$, and $1/\pi$ by suitable choices of $\lambda$. The effect of the phase is indicated by the dotted lines; we have chosen $\theta$, $\theta$, and $\pi/2$ for the phases.

Since $\cos(a - b) = \cos a \cos b + \sin a \sin b$, we have, for any constant $\rho$,

$$\rho \cos(\lambda t - \theta) = \rho(\cos \lambda t \cos \theta + \sin \lambda t \sin \theta) = \alpha \cos \lambda t + \beta \sin \lambda t ,$$

where $\alpha = \rho \cos \theta$ and $\beta = \rho \sin \theta$. Since $\cos^2 \theta + \sin^2 \theta = 1$, $\rho^2 = \alpha^2 + \beta^2$, and since $\tan \theta = \sin \theta / \cos \theta$, $\theta = \tan^{-1}(\beta/\alpha)$. The maximum value of the function $\rho \cos(\lambda t - \theta)$ is $\rho$; $\rho$ is therefore called the amplitude of the function. The quantity $\rho^2$ is called the intensity.

In the light of the above discussion and Figure 4.1 we remark that by a suitable choice of $\lambda$, $\theta$, and $\rho$, we can obtain any desired shape of the cosine curve. The same is also true of the sine curve. We shall make use of this geometric property of the trigonometric functions in Section 4.1.3, wherein we approximate a periodic function $f(t)$ by an infinite linear combination of sines and cosines with varying amplitudes and frequencies.
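The passage between the pair $(\alpha, \beta)$ and the pair $(\rho, \theta)$ can be sketched in a few lines of Python. This is an illustration rather than part of the report: the function names are hypothetical, and `math.atan2` is used in place of a bare $\tan^{-1}(\beta/\alpha)$ as an assumption, so that the phase lands in the correct quadrant when $\alpha$ is negative.

```python
import math

def to_amplitude_phase(alpha, beta):
    """Return (rho, theta) with alpha = rho*cos(theta), beta = rho*sin(theta)."""
    rho = math.hypot(alpha, beta)      # sqrt(alpha^2 + beta^2)
    theta = math.atan2(beta, alpha)    # quadrant-aware inverse tangent
    return rho, theta

def to_coefficients(rho, theta):
    """Inverse map: alpha = rho*cos(theta), beta = rho*sin(theta)."""
    return rho * math.cos(theta), rho * math.sin(theta)

# Check rho*cos(lambda*t - theta) = alpha*cos(lambda*t) + beta*sin(lambda*t)
# at an arbitrary (hypothetical) frequency and time point.
alpha, beta = 3.0, 4.0
rho, theta = to_amplitude_phase(alpha, beta)
lam, t = 0.7, 1.3
lhs = rho * math.cos(lam * t - theta)
rhs = alpha * math.cos(lam * t) + beta * math.sin(lam * t)
print(rho, abs(lhs - rhs) < 1e-12)
```

With $\alpha < 0$ the plain arctangent of $\beta/\alpha$ differs from the phase by $\pi$, which is why the quadrant-aware form is used in this sketch.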

The  Orthogonality  of  Trigonometric  Functions 

An advantage of the trigonometric functions is that they exhibit a certain type of orthogonality property. This property makes it convenient for us to work with them. We shall merely state this property here and refer the reader to Anderson (1971), p. 94, for a proof of the pertinent results.

[Figure 4.1: Cosine curve with varying frequencies and phase. Panels: $\cos \lambda t = \cos(\tfrac{1}{2} t)$ and $\cos(\tfrac{1}{2} t - \theta)$, a cosine function with period $4\pi$ ($\lambda = 1/2$), frequency $1/(4\pi)$, and phase 0 or $\theta$; a cosine function with period $\pi$ ($\lambda = 2$), frequency $1/\pi$, and phase 0 and $\pi/2$.]

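Consider a series of length $T$, and let

$$\left[\tfrac{1}{2} T\right] = \begin{cases} \tfrac{1}{2} T , & \text{if } T \text{ is even,} \\ \tfrac{1}{2}(T - 1) , & \text{if } T \text{ is odd.} \end{cases}$$

Then, the orthogonality property specifies that

$$\sum_{t=1}^{T} \cos \frac{2\pi j}{T} t \, \cos \frac{2\pi k}{T} t = \begin{cases} 0 , & 0 \le k \ne j \le [\tfrac{1}{2} T] , \\ \tfrac{1}{2} T , & 0 < k = j < \tfrac{1}{2} T , \\ T , & k = j = 0 \text{ or } \tfrac{1}{2} T , \end{cases} \tag{4.1}$$

$$\sum_{t=1}^{T} \cos \frac{2\pi j}{T} t \, \sin \frac{2\pi k}{T} t = 0 , \quad k, j = 0, 1, \ldots, [\tfrac{1}{2} T] , \tag{4.2}$$

$$\sum_{t=1}^{T} \sin \frac{2\pi j}{T} t \, \sin \frac{2\pi k}{T} t = \begin{cases} 0 , & 0 \le k \ne j \le [\tfrac{1}{2} T] , \\ \tfrac{1}{2} T , & 0 < k = j < \tfrac{1}{2} T , \\ 0 , & k = j = 0 \text{ or } \tfrac{1}{2} T . \end{cases} \tag{4.3}$$

In the above expressions, we are considering sums over $t = 1, \ldots, T$ of cosine and sine functions of the form $\cos \lambda t$ and $\sin \lambda t$, where $\lambda$ is to be identified with $2\pi j/T$. Since the frequency of $\cos \lambda t$ and $\sin \lambda t$ is $\lambda/2\pi$, the appropriate frequencies in the above equations are $j/T$, $j = 0, 1, \ldots, [\tfrac{1}{2} T]$, and their periods are $T/j$. When (4.1) through (4.3) hold, we say that the $T$ cosine and sine functions with frequencies $j/T$ are orthogonal to each other.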
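The relations (4.1) through (4.3) can be confirmed by direct summation for any particular $T$. The following Python sketch is illustrative only; all function names are assumptions introduced here, and the check is numerical rather than a proof.

```python
import math

def trig_sum(T, f, j, g, k):
    """Sum over t = 1..T of f(2*pi*j*t/T) * g(2*pi*k*t/T)."""
    w = 2 * math.pi / T
    return sum(f(w * j * t) * g(w * k * t) for t in range(1, T + 1))

def expected_cos_cos(T, j, k):
    """Right-hand side of (4.1)."""
    if j != k:
        return 0.0
    return float(T) if (j == 0 or 2 * j == T) else T / 2

def expected_sin_sin(T, j, k):
    """Right-hand side of (4.3)."""
    if j != k or j == 0 or 2 * j == T:
        return 0.0
    return T / 2

def max_orthogonality_error(T):
    """Largest absolute deviation from (4.1)-(4.3) over 0 <= j, k <= [T/2]."""
    half = T // 2
    worst = 0.0
    for j in range(half + 1):
        for k in range(half + 1):
            errors = (
                trig_sum(T, math.cos, j, math.cos, k) - expected_cos_cos(T, j, k),
                trig_sum(T, math.cos, j, math.sin, k),  # (4.2): always zero
                trig_sum(T, math.sin, j, math.sin, k) - expected_sin_sin(T, j, k),
            )
            worst = max(worst, *(abs(e) for e in errors))
    return worst

for T in (8, 9, 18, 33):
    print(T, max_orthogonality_error(T) < 1e-9)
```

The deviations are at the level of floating-point rounding for both even and odd $T$.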


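4.1.2 The Fourier Representation of any Finite Sequence of Numbers

Consider any sequence of $T$ numbers $y_1, y_2, \ldots, y_T$, where $T$ is even. These numbers need not be the observed values of a time series. The $T$ numbers define the coordinates of a point in a space of $T$ dimensions. We would like to refer to this point in another coordinate system. We shall do this by using the orthogonal trigonometric functions discussed above.

Motivated by the result of (4.1)-(4.3), we shall define a $T \times T$ matrix $M$ for $T$ even by

$$M = \sqrt{\frac{2}{T}} \begin{pmatrix}
\frac{1}{\sqrt{2}} & \cos \frac{2\pi}{T} & \sin \frac{2\pi}{T} & \cdots & \cos \frac{2\pi}{T}(\tfrac{1}{2}T - 1) & \sin \frac{2\pi}{T}(\tfrac{1}{2}T - 1) & -\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & \cos \frac{2\pi}{T} 2 & \sin \frac{2\pi}{T} 2 & \cdots & \cos \frac{2\pi}{T}(\tfrac{1}{2}T - 1) 2 & \sin \frac{2\pi}{T}(\tfrac{1}{2}T - 1) 2 & \frac{1}{\sqrt{2}} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
\frac{1}{\sqrt{2}} & 1 & 0 & \cdots & 1 & 0 & \frac{1}{\sqrt{2}}
\end{pmatrix} .$$

From the orthogonality relationships, we have $M'M = I$. Let $y = (y_1, y_2, \ldots, y_T)'$ and $x = (x_1, x_2, \ldots, x_T)'$, where $x = M'y$. Since $MM' = I$, we have $y = Mx$, where $x = M'y$ gives us

$$x_1 = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} y_t ,$$

$$x_{2k} = \sqrt{\frac{2}{T}} \sum_{t=1}^{T} y_t \cos \frac{2\pi k}{T} t , \quad k = 1, 2, \ldots, \tfrac{1}{2} T - 1 ,$$

$$x_{2k+1} = \sqrt{\frac{2}{T}} \sum_{t=1}^{T} y_t \sin \frac{2\pi k}{T} t , \quad k = 1, 2, \ldots, \tfrac{1}{2} T - 1 ,$$

$$x_T = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} y_t (-1)^t .$$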


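Thus, using $y = Mx$ we can write $y_t$, $t = 1, 2, \ldots, T$ as

$$y_t = \sqrt{\frac{2}{T}} \left( \frac{1}{\sqrt{2}} x_1 + x_2 \cos \frac{2\pi}{T} t + x_3 \sin \frac{2\pi}{T} t + \cdots + x_T \frac{(-1)^t}{\sqrt{2}} \right) . \tag{4.4}$$

Equation (4.4) is known as the Fourier representation of $y_1, y_2, \ldots, y_T$ with discrete Fourier coefficients $x_1, x_2, \ldots, x_T$.

When $T$ is odd, we go through an analogous development except that the last column of $M$ is now

$$\sqrt{\frac{2}{T}} \left( \sin \left[ \frac{2\pi \tfrac{1}{2}(T - 1)}{T} \right] , \ \sin \left[ \frac{2\pi \tfrac{1}{2}(T - 1)}{T} 2 \right] , \ \ldots , \ 0 \right)' ,$$

and now

$$y_t = \sqrt{\frac{2}{T}} \left( \frac{1}{\sqrt{2}} x_1 + x_2 \cos \frac{2\pi}{T} t + \cdots + x_T \sin \frac{2\pi \tfrac{1}{2}(T - 1)}{T} t \right) , \quad t = 1, 2, \ldots, T . \tag{4.5}$$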
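The transformation $x = M'y$ and its inverse $y = Mx$ can be written without forming $M$ explicitly. The sketch below is a plain-Python illustration under the definitions above; the function names are assumptions, the returned list is 0-indexed (so `x[0]` holds $x_1$), and for long series a Fast Fourier Transform routine would normally be preferred.

```python
import math

def fourier_coefficients(y):
    """Discrete Fourier coefficients x_1, ..., x_T of y_1, ..., y_T (T even or odd)."""
    T = len(y)
    w = 2 * math.pi / T
    scale = math.sqrt(2 / T)
    x = [sum(y) / math.sqrt(T)]                        # x_1
    for k in range(1, (T - 1) // 2 + 1):
        x.append(scale * sum(v * math.cos(w * k * t) for t, v in enumerate(y, 1)))  # x_{2k}
        x.append(scale * sum(v * math.sin(w * k * t) for t, v in enumerate(y, 1)))  # x_{2k+1}
    if T % 2 == 0:                                     # x_T exists only for even T
        x.append(sum(v * (-1) ** t for t, v in enumerate(y, 1)) / math.sqrt(T))
    return x

def fourier_representation(x, t):
    """Evaluate the right-hand side of (4.4) or (4.5) at time t."""
    T = len(x)
    w = 2 * math.pi / T
    total = x[0] / math.sqrt(2)
    for k in range(1, (T - 1) // 2 + 1):
        total += x[2 * k - 1] * math.cos(w * k * t) + x[2 * k] * math.sin(w * k * t)
    if T % 2 == 0:
        total += x[-1] * (-1) ** t / math.sqrt(2)
    return math.sqrt(2 / T) * total

# Hypothetical sequences, one of even and one of odd length.
y_even = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]
y_odd = [2.0, 7.0, 1.0, 8.0, 2.0]
for y in (y_even, y_odd):
    x = fourier_coefficients(y)
    rebuilt = [fourier_representation(x, t) for t in range(1, len(y) + 1)]
    print(max(abs(a - b) for a, b in zip(y, rebuilt)) < 1e-9)
```

Because $M$ is orthogonal, the sum of squares is also preserved: $\sum x_k^2 = \sum y_t^2$.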


Periodic  Sequences 

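In time series analysis, the sequence of numbers $y_1, y_2, \ldots, y_T$ may be periodic with a period $n$. Thus

$$y_{t+n} = y_t , \quad t = 1, 2, \ldots, T - n ,$$

for some integer $n$. Suppose that $T$ is a multiple of $n$, say $T = hn$. We can represent this sequence in terms of only $n$ trigonometric functions as follows:

When $n$ is even, then for $t = 1, 2, \ldots, n$

$$y_t = \sqrt{\frac{2}{n}} \left( \frac{1}{\sqrt{2}} x_1^* + x_2^* \cos \frac{2\pi}{n} t + x_3^* \sin \frac{2\pi}{n} t + \cdots + x_n^* \frac{(-1)^t}{\sqrt{2}} \right) , \tag{4.6}$$

where

$$x_1^* = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} y_t ,$$

$$x_{2k}^* = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \cos \frac{2\pi k}{n} t , \quad k = 1, \ldots, [\tfrac{1}{2}(n - 1)] ,$$

$$x_{2k+1}^* = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \sin \frac{2\pi k}{n} t , \quad k = 1, \ldots, [\tfrac{1}{2}(n - 1)] ,$$

$$x_n^* = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} y_t (-1)^t .$$

Since the cosine and the sine functions are periodic,

$$y_{t+n} = y_t = \sqrt{\frac{2}{n}} \left( \frac{1}{\sqrt{2}} x_1^* + x_2^* \cos \frac{2\pi}{n}(t + n) + \cdots + x_n^* \frac{(-1)^{t+n}}{\sqrt{2}} \right) ,$$

so that the representation (4.6) holds for $t = 1, 2, \ldots, T$. When $n$ is odd, the term involving $(-1)^t$ is omitted.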


Consider any periodic function $f(t)$ whose period is $\phi$; let $f(t)$ be defined for all values of $t$. Thus

$$f(t) = f(t + \phi) = f(t + 2\phi) = \cdots .$$

We would like to represent $f(t)$ in terms of sine and cosine functions whose period is also $\phi$. That is, we would like to write $f(t)$ in terms of an infinite linear combination of $\cos \frac{2\pi}{\phi} 0t \ (= 1)$, $\sin \frac{2\pi}{\phi} 0t \ (= 0)$, $\cos \frac{2\pi}{\phi}(t)$, $\sin \frac{2\pi}{\phi}(t)$, $\cos \frac{2\pi}{\phi}(2t)$, $\sin \frac{2\pi}{\phi}(2t)$, $\ldots$.

For some constants $\alpha_i$, $\beta_i$, $i = 0, 1, 2, \ldots$, let us consider an infinite series of the form

$$\alpha_0 + \alpha_1 \cos \frac{2\pi}{\phi} t + \beta_1 \sin \frac{2\pi}{\phi} t + \alpha_2 \cos \frac{4\pi}{\phi} t + \beta_2 \sin \frac{4\pi}{\phi} t + \cdots .$$

Our motivation for considering this infinite sum should be apparent from an examination of Figure 4.2. We show there three sine curves with periods $\phi$, $\phi/2$, and $\phi/4$, and amplitudes $\beta_1 = 1$, $\beta_2 = 3/4$ and $\beta_3 = 1/2$ respectively. The sum $\sum_{i=1}^{3} \beta_i \sin \frac{2\pi}{\phi}(it)$, shown by the dotted lines of Figure 4.2, could be considered as an approximation to some $f(t)$. However, by a suitable choice of $\beta_i$, $i = 1, 2, \ldots$, the infinite sum $\sum_{i=1}^{\infty} \beta_i \sin \frac{2\pi}{\phi}(it)$ would be a better approximation to $f(t)$. A similar type of argument leads us to consider an infinite sum of the form $\sum_{i=0}^{\infty} \alpha_i \cos \frac{2\pi}{\phi}(it)$.

In general, we can consider an infinite sum of the pairs of sine and cosine terms

$$\sum_{j=0}^{\infty} \left( \alpha_j \cos \frac{2\pi}{\phi} jt + \beta_j \sin \frac{2\pi}{\phi} jt \right)$$

to provide us with a still better approximation to $f(t)$.

If the infinite series converges to $f(t)$ for some value of $t$, then it also converges to $f(t + \phi)$ since $\cos \frac{2\pi}{\phi} k(t + \phi)$ and $\sin \frac{2\pi}{\phi} k(t + \phi)$ are also periodic functions. Thus the infinite sum of trigonometric functions given above is also periodic with period $\phi$. We can verify using the trigonometric identities [see Anderson (1971), p. 100] that the functions in the infinite series have the following properties:

$$\int_0^{\phi} \cos^2 \frac{2\pi j}{\phi} t \, dt = \int_0^{\phi} \sin^2 \frac{2\pi j}{\phi} t \, dt = \frac{\phi}{2} , \quad j \ne 0 ;$$

$$\int_0^{\phi} \cos \frac{2\pi j}{\phi} t \, \cos \frac{2\pi k}{\phi} t \, dt = \int_0^{\phi} \sin \frac{2\pi j}{\phi} t \, \sin \frac{2\pi k}{\phi} t \, dt = 0 , \quad j \ne k ;$$

$$\int_0^{\phi} \cos \frac{2\pi j}{\phi} t \, \sin \frac{2\pi k}{\phi} t \, dt = 0 , \quad \text{all } j \text{ and } k .$$

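Under some very general conditions on $f(t)$, Fourier analysis tells us that the infinite series considered here converges to $f(t)$ at every continuity point of $f(t)$. When this happens, and if term by term integration is permissible, then

$$\int_0^{\phi} f(t) \cos \frac{2\pi k}{\phi} t \, dt = \alpha_0 \int_0^{\phi} \cos \frac{2\pi k}{\phi} t \, dt + \int_0^{\phi} \sum_{j=1}^{\infty} \left( \alpha_j \cos \frac{2\pi j}{\phi} t + \beta_j \sin \frac{2\pi j}{\phi} t \right) \cos \frac{2\pi k}{\phi} t \, dt = \frac{\phi}{2} \alpha_k$$

for $k \ne 0$, since by the properties above every term on the right vanishes except the one involving $\cos^2 \frac{2\pi k}{\phi} t$. The above equation determines $\alpha_k$ as

$$\alpha_k = \frac{2}{\phi} \int_0^{\phi} f(t) \cos \frac{2\pi k}{\phi} t \, dt , \quad k \ne 0 . \tag{4.7}$$

Also

$$\alpha_0 = \frac{1}{\phi} \int_0^{\phi} f(t) \, dt . \tag{4.8}$$

In a similar manner, multiplication by $\sin \frac{2\pi k}{\phi} t$ gives us

$$\beta_k = \frac{2}{\phi} \int_0^{\phi} f(t) \sin \frac{2\pi k}{\phi} t \, dt , \quad k \ne 0 . \tag{4.9}$$

When the Fourier coefficients $\alpha_0$, $\alpha_k$, and $\beta_k$ are chosen according to equations (4.7), (4.8) and (4.9) the series is said to represent $f(t)$.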
In time series analysis, the function $f(t)$ is used as a trend function in an error model of the form $y_t = f(t) + u_t$. Thus only the values of $f(t)$ at $t = 1, 2, \ldots, T$ are relevant. If $f(t)$ is assumed to be periodic with period $n$, where $n$ is an integer, then only the $n$ values $f(1), f(2), \ldots, f(n)$ appear in our analysis. In such a case the function can be represented at $t = 1, 2, \ldots, n$ by a linear combination of $n$ trigonometric functions as was done in Section 4.1.2.

Summary  of  Section  4.1 

The main point of the discussion in Section 4.1 is that any finite sequence of observations, or any periodic function $f(t)$, can be alternatively represented by a linear combination of trigonometric functions with different frequencies. Specifically,

1) Any finite sequence of $T$ observations $y_1, y_2, \ldots, y_T$, periodic or otherwise, can be transformed into a finite set of Fourier coefficients $x_1, x_2, \ldots, x_T$ by the use of the orthogonal matrix $M$.

2) Any periodic function $f(t)$ defined for all values of $t$ can be represented by an infinite sum of Fourier terms whose coefficients $\alpha_k$ and $\beta_k$ are the trigonometric integrals given by equations (4.7) through (4.9).

The need for these alternate representations will be evident in Section 4.2 wherein we will discuss the statistical estimation of cyclical trends. The coefficients $x_1, \ldots, x_T$, the $\alpha_k$ and the $\beta_k$ can be given some physical interpretations; these too will be discussed later. However, before we proceed to Section 4.2, we shall first illustrate the methodology of this section via some examples.

4.1.4  Examples  Illustrating  the  Use  of  Fourier  Representations 
We  shall  consider  here  three  examples.  The  first  two  examples 
pertain  to  observations  from  a  real  life  time  series,  and  the  third  one 


pertains  to  an  arbitrary  function.  The  first  two  examples  illustrate 
the  methodology  of  Section  4.1.2  whereas  the  latter  illustrates  the 
methodology  of  Section  4.1.3. 


Example  1 

In Table 4.1 we present some data on Wolfer's Sunspot Numbers rounded to the nearest integers from the year 1911 through 1933. These numbers have been taken from Table A.3.1 of Anderson (1971). We would like to obtain a Fourier representation of these numbers, denoted by us as $y_t$, for $t = 1, 2, \ldots, 33$. A graph of $y_t$ versus $t$ is shown in Figure 4.2.1.

We will have to obtain 33 Fourier coefficients $x_1, x_2, \ldots, x_{33}$ in order to obtain the desired Fourier representation. We shall first compute

$$x_1 = \frac{1}{\sqrt{33}} \sum_{t=1}^{33} y_t = 255.894 ,$$

$$x_{2k} = \sqrt{\frac{2}{33}} \sum_{t=1}^{33} y_t \cos \frac{2\pi k}{33} t , \quad k = 1, 2, \ldots, 16 ,$$

$$x_{2k+1} = \sqrt{\frac{2}{33}} \sum_{t=1}^{33} y_t \sin \frac{2\pi k}{33} t , \quad k = 1, 2, \ldots, 16 .$$


The computed values of $x_1, \ldots, x_{33}$ are given in column 4 of Table 4.1. Since $T (= 33)$ is odd, the Fourier representation of $y_t$, $t = 1, 2, \ldots, 33$ is given as

$$y_t = \sqrt{\frac{2}{33}} \left[ \frac{x_1}{\sqrt{2}} + \sum_{k=1}^{16} \left( x_{2k} \cos \frac{2\pi k}{33} t + x_{2k+1} \sin \frac{2\pi k}{33} t \right) \right] .$$

Table 4.1

Wolfer's Sunspot Numbers and Their Fourier Representation

The values of $y_t$ computed by using the above formula are given in column 3 of Table 4.1. We note that these values are quite close to the observed values of $y_t$; the small differences between the two sets of values are due to the rounding errors in obtaining the $x_i$'s, $i = 1, 2, \ldots, 33$. These computations were made on a computer using double precision. In practice with a longer series one would use the Fast Fourier Transform. Computer packages are available.

It is of interest to note that the largest pair of coefficients (except for the constant $x_1$) is $x_6 = 129.425$ and $x_7 = 102.646$. These contribute

$$129.425 \cos 2\pi \frac{3t}{33} + 102.646 \sin 2\pi \frac{3t}{33} = 162.073 \cos \left( 2\pi \frac{3t}{33} - .6705 \right)$$

to the sum. This corresponds to a period of 11 years.


Example  2 

In Table 4.2 we present some data on the average bi-monthly expenses (footnote 1/) $y_t$ (in local currency), of a typical family in Kabiria (a city in Northern Algeria) over the time period Jan.-Feb. 1975 through Nov.-Dec. 1977. We would like to obtain a Fourier representation of the $y_t$, $t = 1, 2, \ldots, 18$.

Even though the bi-monthly expenses should constitute a periodic series with a period of 6, we note that due to the randomness of the data $y_t \ne y_{t+6}$. Thus we will have to obtain 18 Fourier coefficients $x_1, \ldots, x_{18}$ to obtain the desired Fourier representation. (We could of course treat this as periodic data with a period 6 and obtain the 6 Fourier coefficients $x_1^*, \ldots, x_6^*$.)

1/ Average of the actual expenses for two months.


We shall first obtain

$$x_1 = \frac{1}{\sqrt{18}} \sum_{t=1}^{18} y_t = 22.3 ,$$

$$x_{2k} = \sqrt{\frac{2}{18}} \sum_{t=1}^{18} y_t \cos \frac{2\pi k}{18} t , \quad k = 1, 2, \ldots, 8 ,$$

$$x_{2k+1} = \sqrt{\frac{2}{18}} \sum_{t=1}^{18} y_t \sin \frac{2\pi k}{18} t , \quad k = 1, 2, \ldots, 8 ,$$

$$x_{18} = \frac{1}{\sqrt{18}} \sum_{t=1}^{18} y_t (-1)^t = \frac{1}{\sqrt{18}} (10.15) = 2.3923779 .$$

The computed values of $x_1, \ldots, x_{18}$ are shown in column 4 of Table 4.2. Since $T (= 18)$ is even, the Fourier representation of $y_t$, $t = 1, 2, \ldots, 18$ is given as

$$y_t = \sqrt{\frac{2}{18}} \left[ \frac{x_1}{\sqrt{2}} + \sum_{k=1}^{8} \left( x_{2k} \cos \frac{2\pi k}{18} t + x_{2k+1} \sin \frac{2\pi k}{18} t \right) + x_{18} \frac{(-1)^t}{\sqrt{2}} \right] .$$

The values of $y_t$ computed by using the above formula are given in column 3 of Table 4.2. The slight disparity between the observed values $y_t$ (column 2 of Table 4.2), and the Fourier representation of $y_t$ (column 3 of Table 4.2), is due to the rounding and computational errors in obtaining the $x_i$, $i = 1, 2, \ldots, 18$.

In Figure 4.3, we show a plot of the observed values of $y_t$.

Table 4.2

The Average Bi-Monthly Expenses of a Family in Kabiria and Their Fourier Representation

                         Average Bi-Monthly   Fourier Representation   Fourier Coefficients
Time t   Period          Expenses, y_t        of y_t                   x_k              k
  1      Jan-Feb 75       4.71                 4.71586                 22.304505        1
  2      Mar-Apr 75       3.80                 3.79525                   .254477        2
  3      May-Jun 75       3.33                 3.33307                   .269736        3
  4      Jul-Aug 75       9.50                 9.49889                   .285116        4
  5      Sept-Oct 75      6.21                 6.20894                  -.777919        5
  6      Nov-Dec 75       4.27                 4.27305                 -1.22643         6
  7      Jan-Feb 76       4.34                 4.3353                  -6.17766         7
  8      Mar-Apr 76       4.31                 4.31574                   .364677        8
  9      May-Jun 76       3.65                 3.64386                  -.066686        9
 10      Jul-Aug 76       9.67                 9.67573                  -.903802       10
 11      Sep-Oct 76       5.33                 5.32531                  -.151617       11
 12      Nov-Dec 76       3.00                 3.00306                 -4.96687        12
 13      Jan-Feb 77       5.31                 5.3089                   4.34145        13
 14      Mar-Apr 77       3.34                 3.33897                   .709414       14
 15      May-Jun 77       3.36                 3.363                     .652794       15
 16      Jul-Aug 77      10.5                 10.4953                    .00203447     16
 17      Sept-Oct 77      6.00                 6.00566                  -.414406       17
 18      Nov-Dec 77       4.00                 3.99402                  2.3923779      18

[Figure 4.3: Plot of the observed bi-monthly expenses $y_t$.]

Here the two largest pairs of coefficients (except the constant $x_1$) are:

$x_6 = -1.2264$ and $x_7 = -6.1776$, and

$x_{12} = -4.96687$ and $x_{13} = 4.34145$.

They contribute

$$-1.2264 \cos \frac{2\pi 3t}{18} - 6.1776 \sin \frac{2\pi 3t}{18} = -6.2982 \cos \left( \frac{2\pi 3t}{18} - 1.3748 \right) ,$$

$$-4.96687 \cos \frac{2\pi 6t}{18} + 4.34145 \sin \frac{2\pi 6t}{18} = -6.5968 \cos \left( \frac{2\pi 6t}{18} + .7183 \right)$$

to the sum. These correspond to periods of 12 months and 6 months respectively (note, the data is bi-monthly).

We  shall  reconsider  this  data,  and  perform  some  additional  analysis 
on  it  in  Sections  4.3  and  4.4. 
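The coefficients quoted in Example 2 can be recomputed directly from the observed expenses in Table 4.2. The following Python sketch is an illustration, not the program used for the report; the function name and the dictionary `coef` (keyed by the subscript used in the text) are assumptions.

```python
import math

# Observed average bi-monthly expenses y_1, ..., y_18 from Table 4.2.
expenses = [4.71, 3.80, 3.33, 9.50, 6.21, 4.27,
            4.34, 4.31, 3.65, 9.67, 5.33, 3.00,
            5.31, 3.34, 3.36, 10.5, 6.00, 4.00]

def fourier_coefficients(y):
    """Discrete Fourier coefficients x_1, ..., x_T as defined in Section 4.1.2."""
    T = len(y)
    w = 2 * math.pi / T
    scale = math.sqrt(2 / T)
    x = [sum(y) / math.sqrt(T)]
    for k in range(1, (T - 1) // 2 + 1):
        x.append(scale * sum(v * math.cos(w * k * t) for t, v in enumerate(y, 1)))
        x.append(scale * sum(v * math.sin(w * k * t) for t, v in enumerate(y, 1)))
    if T % 2 == 0:
        x.append(sum(v * (-1) ** t for t, v in enumerate(y, 1)) / math.sqrt(T))
    return x

# coef[k] is x_k in the numbering of the text (1-based).
coef = {i + 1: v for i, v in enumerate(fourier_coefficients(expenses))}

# Magnitudes of the two dominant pairs (x_6, x_7) and (x_12, x_13).
pair_12_months = math.hypot(coef[6], coef[7])
pair_6_months = math.hypot(coef[12], coef[13])

for k in (1, 6, 7, 12, 13, 18):
    print(f"x_{k} = {coef[k]:.5f}")
print(f"{pair_12_months:.4f} {pair_6_months:.4f}")
```

The recomputed values agree with column 4 of Table 4.2 to roughly three decimal places, which is consistent with the rounding and computational errors mentioned in the text.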


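Example 3

Let us consider the periodic function with period 5

$$f(t) = \begin{cases} 1 - t^2 , & 0 \le t < 1 , \\ -1 , & 1 \le t < 3 , \\ \tfrac{1}{4}(t - 3)^2 , & 3 \le t < 5 . \end{cases}$$

We would like to approximate this function by a finite sum of Fourier terms with coefficients $\alpha_k$ and $\beta_k$ for

i) $k = 0, 1$, and
ii) $k = 0, 1, 2, 3$.

Note that the periodic function $f(t)$ is defined for all values of $t$, $0 \le t < \infty$. Since the function is periodic with period $\phi = 5$, it suffices for us to concentrate on the range $0 \le t \le 5$. From (4.7), (4.8) and (4.9), we have

$$\alpha_0 = \frac{1}{5} \int_0^5 f(t) \, dt = \frac{1}{5} \left[ \int_0^1 (1 - t^2) \, dt + \int_1^3 (-1) \, dt + \int_3^5 \tfrac{1}{4}(t - 3)^2 \, dt \right] = -\frac{2}{15} ,$$

$$\alpha_1 = \frac{2}{5} \int_0^5 f(t) \cos \frac{2\pi}{5} t \, dt = .9107 ,$$

$$\alpha_2 = \frac{2}{5} \int_0^5 f(t) \cos \frac{2\pi}{5} 2t \, dt = 0.15 ,$$

$$\alpha_3 = \frac{2}{5} \int_0^5 f(t) \cos \frac{2\pi}{5} 3t \, dt = 0.10 ,$$

$$\beta_1 = \frac{2}{5} \int_0^5 f(t) \sin \frac{2\pi}{5} t \, dt = -.3768 ,$$

$$\beta_2 = \frac{2}{5} \int_0^5 f(t) \sin \frac{2\pi}{5} 2t \, dt = .2033 ,$$

$$\beta_3 = \frac{2}{5} \int_0^5 f(t) \sin \frac{2\pi}{5} 3t \, dt = .180 .$$

Thus our approximation for $f(t)$ is given by

i) $$f(t) \approx \alpha_0 + \alpha_1 \cos \frac{2\pi}{5} t + \beta_1 \sin \frac{2\pi}{5} t = -.13 + .9107 \cos \frac{2\pi}{5} t - .37680 \sin \frac{2\pi}{5} t , \quad 0 \le t \le 5 ;$$

ii) $$\begin{aligned} f(t) &\approx \alpha_0 + \alpha_1 \cos \frac{2\pi}{5} t + \beta_1 \sin \frac{2\pi}{5} t + \alpha_2 \cos \frac{2\pi}{5} 2t + \beta_2 \sin \frac{2\pi}{5} 2t + \alpha_3 \cos \frac{2\pi}{5} 3t + \beta_3 \sin \frac{2\pi}{5} 3t \\ &= -.13 + .9107 \cos \frac{2\pi}{5} t - .37680 \sin \frac{2\pi}{5} t + .15 \cos \frac{2\pi}{5} 2t + .2033 \sin \frac{2\pi}{5} 2t + .10 \cos \frac{2\pi}{5} 3t + .180 \sin \frac{2\pi}{5} 3t , \quad 0 \le t \le 5 . \end{aligned}$$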

In Figure 4.4 we show a plot of $f(t)$ by the boldface line and a plot of the above two approximations to $f(t)$ by the dotted lines. Based upon an examination of these plots one can see that there has been a significant improvement in the approximation in going from $k = 0, 1$ to $k = 0, 1, 2, 3$; the latter approximation reveals a flatter curve at the bottom than the former. In principle, as we increase $k$ the approximation gets better. The computation of the $\alpha_k$'s and the $\beta_k$'s is rather cumbersome and can be best accomplished on a computer.
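As an indication of how such a computation might be organized, the Python sketch below evaluates the integrals (4.7) through (4.9) for the function of this example by a simple midpoint rule. It is only an illustration: the midpoint rule, the number of cells, and all function names are assumptions made here, and the piecewise definition of $f(t)$ is taken with left-closed intervals.

```python
import math

PERIOD = 5.0  # the period phi of the example

def f(t):
    """Piecewise function of the example, extended periodically."""
    t = t % PERIOD
    if t < 1:
        return 1 - t * t
    if t < 3:
        return -1.0
    return 0.25 * (t - 3) ** 2

def midpoint_integral(g, a, b, n=5000):
    # Assumption: plain midpoint rule; with n = 5000 on [0, 5] the jumps of f
    # at t = 1 and t = 3 fall on cell boundaries, never inside a cell.
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def fourier_series_coefficients(K):
    """Return alpha_0, [alpha_1..alpha_K] and [beta_1..beta_K] from (4.7)-(4.9)."""
    alpha0 = midpoint_integral(f, 0.0, PERIOD) / PERIOD
    alphas, betas = [], []
    for k in range(1, K + 1):
        w = 2 * math.pi * k / PERIOD
        alphas.append(2 / PERIOD * midpoint_integral(lambda t: f(t) * math.cos(w * t), 0.0, PERIOD))
        betas.append(2 / PERIOD * midpoint_integral(lambda t: f(t) * math.sin(w * t), 0.0, PERIOD))
    return alpha0, alphas, betas

def partial_sum(t, alpha0, alphas, betas):
    """Finite Fourier approximation to f(t) using the supplied coefficients."""
    total = alpha0
    for k, (a, b) in enumerate(zip(alphas, betas), start=1):
        w = 2 * math.pi * k / PERIOD
        total += a * math.cos(w * t) + b * math.sin(w * t)
    return total

alpha0, alphas, betas = fourier_series_coefficients(3)
print(round(alpha0, 4), [round(a, 4) for a in alphas], [round(b, 4) for b in betas])
```

Passing only the first coefficient of each list to `partial_sum` gives approximation i), and passing all three gives approximation ii).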

[Figure 4.4: A plot of the function $f(t)$ and its approximations using Fourier terms with $k = 0, 1$ and $k = 0, 1, 2, 3$.]

4.2 Statistical Estimation Procedures for Cyclical Trends

We shall now turn to the main theme of Section 4; that is, the statistical estimation of cyclical trends. Recall that our basic model for an observed time series $y_t$, $t = 1, 2, \ldots, T$, is

$$y_t = f(t) + u_t$$

with $\mathcal{E} u_t = 0$, $\mathcal{E} u_t^2 = \sigma^2$, and $\mathcal{E} u_t u_s = 0$, $t \ne s$. Assume that $f(t)$ is periodic with a known period which divides $T$. Our goal is to obtain an estimate of $f(t)$ and make some inferences about it using the $T$ observed values $y_1, y_2, \ldots, y_T$.

Since $f(t)$ is periodic, we can represent it by a linear combination of sines and cosines using the methodology of Section 4.1. To this effect, suppose that $T$ is odd, and consider the set of integers $I = \{1, 2, \ldots, \tfrac{1}{2}(T - 1)\}$; let $\{k_1, k_2, \ldots, k_q\}$ be any proper subset of $I$. (A proper subset is strictly smaller than the full set.) For example, let $T$ be 9, so that the set of integers $I$ is $\{1, 2, 3, 4\}$; then we can choose $\{k_1 = 1, k_2 = 3, k_3 = 4\}$ as a subset of $I$. In order to be able to estimate $\sigma^2$ (and also test hypotheses about some parameters to be introduced later), it is important that $\{k_1, \ldots, k_q\}$ be a proper subset of $I$, and not be equal to $I$.

Having chosen the subset $k_1, k_2, \ldots, k_q$, let us consider functions of the form $\sin(\frac{2\pi}{T} k_j t)$ and $\cos(\frac{2\pi}{T} k_j t)$, $j = 1, 2, \ldots, q$; the periods of these functions are $T/k_j$, $j = 1, 2, \ldots, q$. Following the development in Section 4.1.3, let us consider $f(t)$ as

$$f(t) = \alpha_0 + \sum_{j=1}^{q} \left[ \alpha(k_j) \cos \frac{2\pi k_j}{T} t + \beta(k_j) \sin \frac{2\pi k_j}{T} t \right] , \tag{4.10}$$

where $\alpha_0$, $\alpha(k_j)$ and $\beta(k_j)$ are the coefficients associated with the trigonometric terms. Note that in the above representation, the trigonometric terms with period 2 are not included. In order to include trigonometric terms with a period 2, $T$ must be even; this case will be considered later. If we let

$$\rho(k_j) = \sqrt{\alpha^2(k_j) + \beta^2(k_j)} , \quad \theta(k_j) = \arctan \frac{\beta(k_j)}{\alpha(k_j)} ,$$

then

$$\alpha(k_j) = \rho(k_j) \cos \theta(k_j) \quad \text{and} \quad \beta(k_j) = \rho(k_j) \sin \theta(k_j) ,$$

and so

$$f(t) = \alpha_0 + \sum_{j=1}^{q} \rho(k_j) \cos \left[ \frac{2\pi k_j}{T} t - \theta(k_j) \right] . \tag{4.11}$$

Using the observed values of the series $y_1, y_2, \ldots, y_T$, our objective is to obtain the least squares estimators of $\alpha_0$, $\alpha(k_j)$ and $\beta(k_j)$, $j = 1, 2, \ldots, q$. We shall write

$$y_t = \alpha_0 + \rho(k_1) \cos \left[ \frac{2\pi k_1}{T} t - \theta(k_1) \right] + \rho(k_2) \cos \left[ \frac{2\pi k_2}{T} t - \theta(k_2) \right] + \cdots + \rho(k_q) \cos \left[ \frac{2\pi k_q}{T} t - \theta(k_q) \right] + u_t , \quad t = 1, 2, \ldots, T ,$$

as our model.

By taking advantage of the orthogonality property of the trigonometric terms, and by using standard techniques, the least squares estimators of $\alpha_0$, $\alpha(k_j)$ and $\beta(k_j)$ are:

$$a_0 = \bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t ,$$

$$a(k_j) = \frac{2}{T} \sum_{t=1}^{T} y_t \cos \frac{2\pi k_j}{T} t , \quad j = 1, 2, \ldots, q , \tag{4.13}$$

$$b(k_j) = \frac{2}{T} \sum_{t=1}^{T} y_t \sin \frac{2\pi k_j}{T} t , \quad j = 1, 2, \ldots, q . \tag{4.14}$$


Because of orthogonality, the above estimators are uncorrelated; also the least squares estimator of $\sigma^2$ takes a particularly simple form

$$s^2 = \frac{\sum_{t=1}^{T} y_t^2 - T \bar{y}^2 - \tfrac{1}{2} T \sum_{j=1}^{q} \left[ a^2(k_j) + b^2(k_j) \right]}{T - (2q + 1)} , \tag{4.15}$$

and the estimates of $\rho(k_j)$ and $\theta(k_j)$ are

$$R(k_j) = \sqrt{a^2(k_j) + b^2(k_j)} , \tag{4.16}$$

$$\hat{\theta}(k_j) = \arctan \frac{b(k_j)}{a(k_j)} , \quad j = 1, 2, \ldots, q , \tag{4.17}$$

respectively.
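The estimators above translate directly into code. The following Python sketch covers the case of odd $T$ only; it is an illustration under that assumption, the function name `cyclical_trend_fit` is hypothetical, and the test series is an artificial noise-free trend constructed so that the exact answer is known.

```python
import math

def cyclical_trend_fit(y, ks):
    """Least squares estimates a_0, a(k), b(k) and s^2 for odd T.

    ks is the chosen subset {k_1, ..., k_q} of {1, ..., (T - 1)/2}.
    """
    T = len(y)
    w = 2 * math.pi / T
    a0 = sum(y) / T
    a = {k: 2 / T * sum(v * math.cos(w * k * t) for t, v in enumerate(y, 1)) for k in ks}
    b = {k: 2 / T * sum(v * math.sin(w * k * t) for t, v in enumerate(y, 1)) for k in ks}
    # Residual sum of squares, written as in the numerator of (4.15).
    rss = sum(v * v for v in y) - T * a0**2 - T / 2 * sum(a[k] ** 2 + b[k] ** 2 for k in ks)
    s2 = rss / (T - (2 * len(ks) + 1))
    return a0, a, b, s2

# Artificial series of length T = 9 with a single cycle at k = 2 and no error term.
T = 9
series = [2 + 3 * math.cos(2 * math.pi * 2 * t / T) + math.sin(2 * math.pi * 2 * t / T)
          for t in range(1, T + 1)]

a0, a, b, s2 = cyclical_trend_fit(series, [1, 2])   # a proper subset of {1, 2, 3, 4}
R = {k: math.hypot(a[k], b[k]) for k in a}          # estimated amplitudes, as in (4.16)
print(round(a0, 6), round(a[2], 6), round(b[2], 6), round(R[1], 6), round(s2, 6))
```

Since the subset $\{1, 2\}$ is proper, the divisor $T - (2q + 1) = 4$ is positive; choosing all four integers would leave no degrees of freedom for $s^2$.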


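When $T$ is even, we consider subsets of the integers $\{1, 2, \ldots, \tfrac{1}{2} T\}$. For $\tfrac{1}{2} T$ we have $\cos \frac{2\pi \frac{1}{2} T}{T} t = \cos \pi t = (-1)^t$ and $\sin \frac{2\pi \frac{1}{2} T}{T} t = \sin \pi t = 0$; there is only one function. When $T$ is even, we can include the coefficient associated with a period 2, writing $f(t)$ as

$$f(t) = \alpha_0 + \sum_{j=1}^{q} \left[ \alpha(k_j) \cos \frac{2\pi k_j}{T} t + \beta(k_j) \sin \frac{2\pi k_j}{T} t \right] + \alpha_{T/2} (-1)^t .$$

The least squares estimator of $\alpha_{T/2}$ is

$$a_{T/2} = \frac{1}{T} \sum_{t=1}^{T} y_t (-1)^t . \tag{4.18}$$

Thus, the estimator of $f(t)$ is

$$\hat{f}(t) = a_0 + \sum_{j=1}^{q} \left[ a(k_j) \cos \frac{2\pi}{T} k_j t + b(k_j) \sin \frac{2\pi}{T} k_j t \right] , \tag{4.19}$$

if $T$ is odd, and

$$\hat{f}(t) = a_0 + \sum_{j=1}^{q} \left[ a(k_j) \cos \frac{2\pi}{T} k_j t + b(k_j) \sin \frac{2\pi}{T} k_j t \right] + a_{T/2} (-1)^t ,$$

if $T$ is even.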


An estimate of $\sigma^2$ when $T$ is even is given by

$$s^2 = \frac{\sum_{t=1}^{T} y_t^2 - T \left( \bar{y}^2 + a_{T/2}^2 \right) - \tfrac{1}{2} T \sum_{j=1}^{q} \left[ a^2(k_j) + b^2(k_j) \right]}{T - (2q + 2)} . \tag{4.20}$$

Remarks:

We note that the expressions for $a_0$, $a(k_j)$, and $b(k_j)$ given above are similar to those of the discrete Fourier coefficients $x_1, x_2, \ldots, x_T$ of a sequence of numbers $y_1, \ldots, y_T$. Specifically, comparing (4.13) and (4.14) with the definitions of $x_{2k}$ and $x_{2k+1}$, $a(k) = \sqrt{2/T} \, x_{2k}$ and $b(k) = \sqrt{2/T} \, x_{2k+1}$. In fact, if we had to choose $\{k_1, k_2, \ldots, k_q\} = \{1, 2, \ldots, (T - 1)/2\}$ (when $T$ is odd), then the estimated values of $a_0$, $a(k_j)$ and $b(k_j)$ would be proportional to the discrete Fourier coefficients, so that $y_t = \hat{f}(t)$. However, this choice of the $\{k_j\}$ will not permit us to estimate $\sigma^2$ and test hypotheses about the $\alpha(k_j)$ and the $\beta(k_j)$.


4.3 The Periodgram and the Spectrum

There  are  two  distinct  ideas  which  motivate  our  study  of  cyclical 
trends.  One  is  to  describe  seasonal  variation  where  the  periods  are 
known.  The  other  is  to  discover  "hidden  periodicities".  The  periodgram 
and  the  spectrum  are  important  tools  which  enable  us  to  identify  the 
periodic  components  in  a  time  series  and  provide  us  with  a  vehicle  to 
interpret  visually  the  estimates  a(kj)  and  b(kj). 

Note that $\hat{f}(t)$ given by equation (4.19) can also be written as

$$\hat{f}(t) = \begin{cases} a_0 + \sum_{j=1}^{q} R(k_j) \cos \left( \frac{2\pi}{T} k_j t - \hat{\theta}(k_j) \right) , & T \text{ odd,} \\ a_0 + \sum_{j=1}^{q} R(k_j) \cos \left( \frac{2\pi}{T} k_j t - \hat{\theta}(k_j) \right) + a_{T/2} (-1)^t , & T \text{ even.} \end{cases} \tag{4.21}$$

The quantity $R(k_j)$ given in (4.16) denotes the estimated amplitude of the cosine curve with frequency $k_j/T$ and period $T/k_j$.

A plot of $R^2(k_j)$ versus $T/k_j$ (the period) is called the periodgram. When $T$ is odd, the periodgram may be defined for the periods $T, T/2, T/3, \ldots, 2T/(T - 1)$, whereas if $T$ is even, the periodgram may be defined for the periods $T, T/2, T/3, \ldots, 2$. In either case, the periodgram is defined for periods greater than or equal to 2.

A plot of $R(k_j)$ versus $k_j/T$ (the frequency) is called the spectrogram (footnote 1/). It is defined for the frequencies $\frac{1}{T}, \frac{2}{T}, \ldots, \frac{1}{2}$ if $T$ is even, and for the frequencies $\frac{1}{T}, \frac{2}{T}, \ldots, \frac{1}{2} - \frac{1}{2T}$ if $T$ is odd. In either case, the spectrogram is defined for frequencies less than or equal to a half. The frequencies being more evenly spaced than the periods makes the spectrogram more convenient to use than the periodgram.


¹ In the literature there is confusion about these terms. Many authors use "periodogram" to mean the graph of amplitude against frequency.


4.3.1  Interpretation  of  the  Spectrum

Since $\hat a(k_j)$ and $\hat b(k_j)$ are the least squares estimators, they are the values of $\alpha(k_j)$ and $\beta(k_j)$ that minimize (for odd $T$)

$$
\sum_{t=1}^{T}\left[y_t - \left(\alpha(k_j)\cos\frac{2\pi k_j}{T}t + \beta(k_j)\sin\frac{2\pi k_j}{T}t\right)\right]^2 ,
$$

and $R(k_j) = \sqrt{\hat a^2(k_j) + \hat b^2(k_j)}$ and $\hat\theta(k_j) = \arctan\left(\hat b(k_j)/\hat a(k_j)\right)$ are the values of $\rho(k_j)$ and $\theta(k_j)$ that minimize

$$
\sum_{t=1}^{T}\left[y_t - \rho(k_j)\cos\left(\frac{2\pi}{T}k_j t - \theta(k_j)\right)\right]^2 .
$$

The minimum value of this sum of squares is

$$
\sum_{t=1}^{T} y_t^2 - \frac{T}{2}\left(\hat a^2(k_j) + \hat b^2(k_j)\right) = \sum_{t=1}^{T} y_t^2 - \frac{T}{2}R^2(k_j) .
$$
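The minimum value stated above can be checked numerically. The following is a minimal sketch, not part of the original report: the series `y` is hypothetical, the helper names `harmonic_fit` and `sum_sq` are illustrative, and `atan2` is used in place of a plain arctangent so that the phase lands in the correct quadrant. The identity relies on the orthogonality relations, so the index `k` must satisfy 0 < k < T/2.

```python
import math

def harmonic_fit(y, k):
    """Least squares estimates (a, b) for a single harmonic with frequency k/T."""
    T = len(y)
    w = 2.0 * math.pi * k / T
    a = (2.0 / T) * sum(yt * math.cos(w * t) for t, yt in enumerate(y, start=1))
    b = (2.0 / T) * sum(yt * math.sin(w * t) for t, yt in enumerate(y, start=1))
    return a, b

def sum_sq(y, k, a, b):
    """Sum of squared deviations of y_t from a*cos(wt) + b*sin(wt)."""
    T = len(y)
    w = 2.0 * math.pi * k / T
    return sum((yt - a * math.cos(w * t) - b * math.sin(w * t)) ** 2
               for t, yt in enumerate(y, start=1))

# Hypothetical series of length T = 12 (illustration only).
y = [3.1, 4.7, 6.2, 5.0, 2.9, 1.8, 3.3, 5.1, 6.0, 4.6, 2.7, 2.0]
T, k = len(y), 2

a_hat, b_hat = harmonic_fit(y, k)
R = math.hypot(a_hat, b_hat)          # estimated amplitude
theta = math.atan2(b_hat, a_hat)      # estimated phase

direct = sum_sq(y, k, a_hat, b_hat)
closed_form = sum(v * v for v in y) - (T / 2.0) * R * R
phase_form = sum((yt - R * math.cos(2.0 * math.pi * k * t / T - theta)) ** 2
                 for t, yt in enumerate(y, start=1))

print(direct, closed_form, phase_form)  # the three values agree up to rounding
```

Moving either coefficient away from its estimate, for example `sum_sq(y, k, a_hat + 0.1, b_hat)`, gives a larger sum of squares.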


In view of the above, we conclude that $R(k_j)$ is a measure of how closely a trigonometric function with frequency $k_j/T$ fits the observed data. More pragmatically, if a series of length $T$ has period $\theta$, then the value of $R(k_j)$ corresponding to $k_j = T/\theta$ will tend to be the largest among all the $R(k_j)$'s.

In practice, a plot of the spectrogram enables us to detect the periods in the time series by identifying the frequencies $k_j/T$ associated with large values of $R(k_j)$.

In this section we consider the spectrogram as giving information about the trend. In a later section we shall interpret the spectrogram in terms of the Fourier transform of the correlation function.
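As an illustration of how a large value of $R^2(k_j)$ points to a period, the sketch below builds a synthetic series with a known period of 6 and locates the peak of its spectrogram. The series, the small deterministic disturbance, and the function name `amplitude_sq` are assumptions made for this example; they do not come from the report. Only indices 0 < k < T/2 are scanned, since the term at k = T/2 is estimated by a separate formula.

```python
import math

def amplitude_sq(y, k):
    """R^2(k) = a(k)^2 + b(k)^2 for the harmonic with frequency k/T."""
    T = len(y)
    w = 2.0 * math.pi * k / T
    a = (2.0 / T) * sum(yt * math.cos(w * t) for t, yt in enumerate(y, start=1))
    b = (2.0 / T) * sum(yt * math.sin(w * t) for t, yt in enumerate(y, start=1))
    return a * a + b * b

T = 24
true_k = 4  # the cosine below has frequency 4/24, that is, period 6

# Synthetic data: one cosine of amplitude 2 plus a bounded disturbance (|d_t| <= 0.3).
y = [2.0 * math.cos(2.0 * math.pi * true_k * t / T - 1.0)
     + 0.3 * (((7 * t) % 5) - 2) / 2.0
     for t in range(1, T + 1)]

spectrogram = {k: amplitude_sq(y, k) for k in range(1, T // 2)}
peak_k = max(spectrogram, key=spectrogram.get)

for k in sorted(spectrogram):
    print(f"frequency {k}/{T}: R^2 = {spectrogram[k]:.4f}")
print("largest ordinate at k =", peak_k, "implied period =", T / peak_k)
```

Because the disturbance is small relative to the cosine, the largest ordinate falls at the index of the true period; with noisier data the peak can be less clear cut, which is why the formal tests of Section 4.4 are useful.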


4.3.2  Example  Illustrating  the  Estimation  of  a  Cyclical  Trend 
and  the  Spectrogram 

We shall illustrate the methodology of Sections 4.2 and 4.3 by considering the data in Table 4.2 on the average bi-monthly expenses of a family in Kabiria. The sample Fourier coefficients and the estimated amplitudes are given in Table 4.2.1. (Note $\hat a(h) = \sqrt{T/2}\,x_{2h} = 3x_{2h}$, $\hat b(h) = \sqrt{T/2}\,x_{2h+1} = 3x_{2h+1}$, $h = 1,\ldots,8$, and $\hat a_9 = \sqrt{18}\,x_{18}$.)

The estimated amplitudes are graphed in Figure 4.4.1.


Table 4.2.1

The Discrete Fourier Coefficients for the Data
in Table 4.2 on the Average Bi-Monthly Expenses

  Values of
    k_j        â(k_j)       b̂(k_j)      R²(k_j)      R(k_j)

     1          .0848        .0894        .0152        .1236
     2          .0950       -.2593        .0762        .2761
     3         -.4088      -2.0592       4.4075       2.0994
     4          .1215       -.0222        .0152        .1235
     5         -.3012       -.0505        .0933        .3054
     6        -1.6556       1.4471       4.8353       2.1989
     7          .2364        .2176        .1032        .3213
     8          .0067       -.1381        .0193        .1383
     9          .5639                     .3180


It seems reasonable to consider seasonal variation; that is, the mean value function can be suspected as having a period of 6 since the data are bi-monthly. The trigonometric functions with period 6 correspond to the integers 3, 6, and 9; that is, frequencies 1/6, 1/3, and 1/2. We consider

$$
f(t) = \alpha_0 + \sum_{j=1}^{2}\left[\alpha(3j)\cos\frac{2\pi}{18}3jt + \beta(3j)\sin\frac{2\pi}{18}3jt\right] + \alpha_9(-1)^t .
$$


It will be seen in Figure 4.4.1 that the estimated amplitudes at frequencies 1/6, 2/6, and 3/6 are considerably larger than at the other frequencies. This fact confirms that the seasonal variation is dominant. We estimate the variance by

$$
s^2 = \frac{\sum_{t=1}^{18} y_t^2 - 18\hat a_0^2 - 9\sum_{j=1}^{2} R^2(3j) - 18\hat a_9^2}{12} = .2418 ,
$$

and $s = .4917$.
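The coefficients of Table 4.2.1 and the variance estimate can be recomputed from the observations. The sketch below is an illustration rather than the authors' own computation: the data are transcribed from the "actual" column of Table 4.2.2, the variable and function names are hypothetical, and the usual formulas $\hat a(k) = (2/T)\sum y_t\cos(2\pi kt/T)$, $\hat b(k) = (2/T)\sum y_t\sin(2\pi kt/T)$, $\hat a_0 = \bar y$, and $\hat a_{T/2} = (1/T)\sum y_t(-1)^t$ are assumed.

```python
import math

# Bi-monthly expenses y_1, ..., y_18 (the "actual" column of Table 4.2.2).
y = [4.71, 3.80, 3.33, 9.50, 6.21, 4.27, 4.34, 4.31, 3.65,
     9.67, 5.33, 3.00, 5.31, 3.34, 3.36, 10.50, 6.00, 4.00]
T = len(y)

def fourier_coefficients(k):
    """Return (a(k), b(k)) for a harmonic index 0 < k < T/2."""
    w = 2.0 * math.pi * k / T
    a = (2.0 / T) * sum(yt * math.cos(w * t) for t, yt in enumerate(y, start=1))
    b = (2.0 / T) * sum(yt * math.sin(w * t) for t, yt in enumerate(y, start=1))
    return a, b

a0 = sum(y) / T                                                   # constant term
a_half = sum(yt * (-1) ** t for t, yt in enumerate(y, start=1)) / T  # k = T/2 term

coef = {k: fourier_coefficients(k) for k in range(1, T // 2)}
R2 = {k: a * a + b * b for k, (a, b) in coef.items()}

# Residual sum of squares after fitting the constant, k = 3, k = 6, and k = 9.
rss = (sum(v * v for v in y) - T * a0 ** 2
       - (T / 2.0) * (R2[3] + R2[6]) - T * a_half ** 2)
s2 = rss / (T - 6)  # six coefficients are estimated

for k in sorted(coef):
    a, b = coef[k]
    print(f"k={k}: a={a:8.4f}  b={b:8.4f}  R^2={R2[k]:7.4f}  R={math.sqrt(R2[k]):7.4f}")
print(f"a0={a0:.4f}  a9={a_half:.4f}  s^2={s2:.4f}  s={math.sqrt(s2):.4f}")
```

The printed values should agree with Table 4.2.1 and with $s^2$ above up to rounding in the last digit.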


4.4  Tests  of  Hypotheses  and  Confidence  Regions  for  Coefficients

Because they are the least squares estimates, $E(\hat a_0) = \alpha_0$, $E(\hat a_{T/2}) = \alpha_{T/2}$, $E(\hat a(k_j)) = \alpha(k_j)$, and $E(\hat b(k_j)) = \beta(k_j)$, $j = 1,2,\ldots,q$. The variances of $\hat a_0$ and $\hat a_{T/2}$ are $\sigma^2/T$, whereas the variances of $\hat a(k_j)$ and $\hat b(k_j)$ are $2\sigma^2/T$. To verify the latter, let us recall that

$$
\hat a(k_j) = \frac{2}{T}\sum_{t=1}^{T} y_t\cos\frac{2\pi k_j}{T}t ,
$$

and from the orthogonality of the cosine function (see equation (4.1)), $\sum_{t=1}^{T}\cos^2(2\pi k_j t/T) = T/2$, so that the variance of $\hat a(k_j)$ is $(4/T^2)\,\sigma^2\,(T/2) = 2\sigma^2/T$; similarly, the variance of $\hat b(k_j)$ is $2\sigma^2/T$.
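The variance calculation can be confirmed by summing the squared weights directly. This is a small sketch under the assumption that the $y_t$ are uncorrelated with common variance $\sigma^2$, so that the variance of a linear combination $\sum w_t y_t$ is $\sigma^2\sum w_t^2$; the function names are illustrative.

```python
import math

def cosine_weight_factor(T, k):
    """Sum of squared weights (2/T)cos(2*pi*k*t/T); Var(a_hat(k)) = sigma^2 * factor."""
    return sum(((2.0 / T) * math.cos(2.0 * math.pi * k * t / T)) ** 2
               for t in range(1, T + 1))

def sine_weight_factor(T, k):
    """Sum of squared weights (2/T)sin(2*pi*k*t/T); Var(b_hat(k)) = sigma^2 * factor."""
    return sum(((2.0 / T) * math.sin(2.0 * math.pi * k * t / T)) ** 2
               for t in range(1, T + 1))

def constant_weight_factor(T):
    """Sum of squared weights 1/T, as for a_0 (and, with alternating signs, a_{T/2})."""
    return sum((1.0 / T) ** 2 for _ in range(T))

T = 18
for k in (1, 3, 6, 8):
    print(k, cosine_weight_factor(T, k), sine_weight_factor(T, k), 2.0 / T)
print(constant_weight_factor(T), 1.0 / T)
```

For every index with 0 < k < T/2 the two factors equal 2/T, while the constant-type weights give 1/T, in line with the variances quoted above.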

If we assume that the $y_t$'s are normally distributed, then, since the estimators $\hat a_0$, $\hat a_{T/2}$, $\hat a(k_j)$, and $\hat b(k_j)$ are uncorrelated, they are also normally and independently distributed. Let $s_1^2 = s^2$ for $T$ odd and $s_2^2 = s^2$ for $T$ even. Then $(T - 2q - 1)s_1^2/\sigma^2$ has a chi-square distribution with $(T - 2q - 1)$ degrees of freedom, and $(T - 2q - 2)s_2^2/\sigma^2$ has a chi-square distribution with $(T - 2q - 2)$ degrees of freedom.

A question which may be of interest is whether there exists a cyclical term with a minimum period $T/k_j$ for some specific $j$. Thus, we wish to test the null hypothesis $H_0$ that

$$
\alpha(k_j) = \beta(k_j) = 0 , \quad \text{or equivalently} \quad \rho(k_j) = 0 ,
$$

versus the alternative hypothesis that $\rho(k_j) \neq 0$.

Under the null hypothesis $\hat a(k_j)$ and $\hat b(k_j)$ are independent and are each distributed normally with mean 0 and variance $2\sigma^2/T$. Thus $\hat a^2(k_j)/(2\sigma^2/T)$ and $\hat b^2(k_j)/(2\sigma^2/T)$ are independent and each has a chi-square distribution with one degree of freedom. It follows, then, that $(\hat a^2(k_j) + \hat b^2(k_j))T/(2\sigma^2)$ has a chi-square distribution with 2 degrees of freedom; equivalently, $R^2(k_j)T/(2\sigma^2)$ has a chi-square distribution with 2 degrees of freedom.

To complete our discussion on testing the hypothesis $H_0$, let us denote by $p$ the number of coefficients that we have estimated. Specifically,

$p = 2q + 1$, if $T$ is odd, and

$p = 2q + 2$, if $T$ is even and the term $\alpha_{T/2}$ is estimated by us.

Since $(T - p)s_i^2/\sigma^2$ has a chi-square distribution with $(T - p)$ degrees of freedom, the ratio

$$
\frac{\dfrac{T R^2(k_j)}{2\sigma^2}\Big/ 2}{\dfrac{(T - p)s_i^2}{\sigma^2}\Big/ (T - p)} = \frac{T}{4 s_i^2}\,R^2(k_j) , \qquad i = 1, 2 ,
$$

has an F-distribution with 2 and $(T - p)$ degrees of freedom. We can use this result to test the null hypothesis using standard procedures.
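The test just described can be carried out in a few lines. The sketch below is illustrative only: the function names and the input values are hypothetical, and the tail probability uses the closed form $P(F_{2,n} > x) = (1 + 2x/n)^{-n/2}$, which holds for an F-distribution with 2 numerator degrees of freedom, so no statistical library is needed.

```python
def f_ratio(T, R2, s2):
    """Test criterion T * R^2(k_j) / (4 * s^2) for the hypothesis rho(k_j) = 0."""
    return T * R2 / (4.0 * s2)

def f_upper_tail_2(x, n):
    """P(F > x) for an F-distribution with 2 and n degrees of freedom."""
    return (1.0 + 2.0 * x / n) ** (-n / 2.0)

# Hypothetical inputs: T = 18 observations, p = 6 estimated coefficients,
# a squared amplitude of 1.0 and a variance estimate of 0.25.
T, p = 18, 6
F = f_ratio(T, R2=1.0, s2=0.25)
p_value = f_upper_tail_2(F, T - p)
print(F, p_value)
```

The null hypothesis is rejected at level $\varepsilon$ when the tail probability falls below $\varepsilon$, or equivalently when the criterion exceeds the upper $\varepsilon$ point of the F-distribution with 2 and $T - p$ degrees of freedom.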

If we want to test the hypothesis $\rho(k_j) = 0$, $j = 1,\ldots,q$, and $\alpha_{T/2} = 0$, where $T$ is even, we use

$$
\frac{\dfrac{T}{2}\sum_{j=1}^{q} R^2(k_j) + T\hat a_{T/2}^2}{(2q + 1)\,s_2^2} ,
$$

which has the F-distribution with $2q + 1$ and $T - 2q - 2$ degrees of freedom, respectively, when the null hypothesis is true.


4.4.1  Example

Let us consider again the bi-monthly expenditures. The hypothesis that there is no seasonal variation is the hypothesis that

$$
\alpha(3) = \beta(3) = \alpha(6) = \beta(6) = \alpha_9 = 0 .
$$

The appropriate test criterion is

$$
\frac{9\left[R^2(3) + R^2(6)\right] + 18\hat a_9^2}{5 s^2} = \frac{91.81}{5 \times .2418} = 75.94 .
$$

This is to be referred to the F-distribution with 5 and 12 degrees of freedom. It is clearly significant. (This is obvious from Figure 4.4.1.)

Given that there is seasonal variation, we can ask whether all of the 5 terms are needed. The criterion for testing $\rho(3) = 0$ is

$$
\frac{18 \times 4.4075}{4 \times .3224} = 61.52 ,
$$

which is obviously significant. To test the null hypothesis that $\alpha_9 = 0$, we use

$$
\frac{\sqrt{T}\,\hat a_9}{s} = \frac{\sqrt{18} \times .5639}{.4917} = 4.8656 ,
$$

which is referred to the t-distribution with 12 degrees of freedom. It is significant. ($t_{12}(.01) = 3.054$.)
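The arithmetic of the three criteria can be reproduced directly. The sketch below simply re-evaluates the quotients with the numbers as printed in this example; the variable names are illustrative, and no attempt is made to rederive the inputs.

```python
import math

# Joint test of no seasonal variation: numerator and s^2 as printed above.
overall_F = 91.81 / (5 * 0.2418)

# Test of rho(3) = 0, using the printed quotient 18 x 4.4075 / (4 x .3224).
rho3_F = 18 * 4.4075 / (4 * 0.3224)

# Test of alpha_9 = 0: sqrt(T) * a_9 / s with T = 18.
t_stat = math.sqrt(18) * 0.5639 / 0.4917

print(round(overall_F, 2), round(rho3_F, 2), round(t_stat, 4))
print(t_stat > 3.054)  # comparison with the tabulated value quoted in the text
```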

We could ask whether any of the coefficients with frequencies 1/18, 2/18, 4/18, 5/18, 7/18, 8/18 are needed. However, the corresponding estimated amplitudes are so small compared to the seasonal amplitudes that they cannot be important (even if statistically significant).

The above considerations lead us to propose $\hat f(t)$, where $\hat f(t)$ is a trigonometric function of period 6, as our estimate of the cyclical trend $f(t)$. Specifically,

$$
\hat f(t) = 5.257 - .409\cos\frac{2\pi}{18}3t - 2.059\sin\frac{2\pi}{18}3t - 1.656\cos\frac{2\pi}{18}6t + 1.447\sin\frac{2\pi}{18}6t + .564(-1)^t , \qquad t = 1,\ldots,18 .
$$
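The fitted function can be evaluated directly from the rounded coefficients given above. This is a minimal sketch; the function name `f_hat` is illustrative, and small differences in the third decimal place relative to a table computed from unrounded coefficients are to be expected.

```python
import math

def f_hat(t):
    """Fitted cyclical trend of period 6, using the rounded coefficients above."""
    w = 2.0 * math.pi / 18.0
    return (5.257
            - 0.409 * math.cos(w * 3 * t) - 2.059 * math.sin(w * 3 * t)
            - 1.656 * math.cos(w * 6 * t) + 1.447 * math.sin(w * 6 * t)
            + 0.564 * (-1) ** t)

fitted = [f_hat(t) for t in range(1, 19)]
for t, value in enumerate(fitted, start=1):
    print(f"t={t:2d}  f_hat={value:.3f}")
```

Since every term has a period dividing 6, the eighteen fitted values consist of the same six numbers repeated three times.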

In Table 4.2.2 we give the computed values $\hat f(t)$ together with the actual values $y_t$, and the residuals $y_t - \hat f(t)$, $t = 1,\ldots,18$. In Figure 4.4 we give a plot of the actual and the fitted values $y_t$ and $\hat f(t)$, respectively. This enables us to judge visually the appropriateness of $\hat f(t)$. Figure 4.5 is a plot of the residuals $(y_t - \hat f(t))$ against time $t$. The analysis of residuals is an important part of a statistical study, since the residuals give us an indication of how well the fitted values $\hat f(t)$ behave in relationship to the actual values $y_t$.

In Figure 4.6, we plot the fitted values $\hat f(t)$ against the observed values $y_t$; such a plot is also a part of the statistical analysis.

Table 4.2.2

Actual Bi-Monthly Expenses of a Family in Kabiria and
Their Fitted Values Using a Trigonometric Function of Period 6

             Actual Average
  Time       Bi-Monthly Expenses     Fitted Values     Residuals
   t               y_t                   f̂(t)          y_t - f̂(t)

   1               4.71                  4.787            -.077
   2               3.80                  3.817            -.017
   3               3.33                  3.447            -.117
   4               9.50                  9.889            -.390
   5               6.21                  5.847             .370
   6               4.27                  3.757             .513
   7               4.34                  4.787            -.447
   8               4.31                  3.817             .493
   9               3.65                  3.446             .204
  10               9.67                  9.889            -.220
  11               5.33                  5.847            -.517
  12               3.00                  3.757            -.757
  13               5.31                  4.787             .523
  14               3.34                  3.817            -.477
  15               3.36                  3.447            -.087
  16              10.50                  9.889             .610
  17               6.00                  5.847             .153
  18               4.00                  3.757             .243

[Figure 4.4: actual values (marked X) and fitted values (marked ⊙) plotted against time t.]